Date: Sun, 27 Jan 2019 14:15:56 +0000
From: Mel Gorman
To: valdis.kletnieks@vt.edu
Cc: Pavel Machek, kernel list, Andrew Morton, vbabka@suse.cz, aarcange@redhat.com, rientjes@google.com, mhocko@kernel.org, zi.yan@cs.rutgers.edu, hannes@cmpxchg.org, jack@suse.cz
Subject: Re: [regression -next0117] What is kcompactd and why is he eating 100% of my cpu?
Message-ID: <20190127141556.GB9565@techsingularity.net>
References: <20190126200005.GB27513@amd> <12171.1548557813@turing-police.cc.vt.edu>
In-Reply-To: <12171.1548557813@turing-police.cc.vt.edu>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Jan 26, 2019 at 09:56:53PM -0500, valdis.kletnieks@vt.edu wrote:
> On Sat, 26 Jan 2019 21:00:05 +0100, Pavel Machek said:
>
> > top - 13:38:51 up 1:42, 16 users, load average: 1.41, 1.93, 1.62
> > Tasks: 182 total, 3 running, 138 sleeping, 0 stopped, 0 zombie
> > %Cpu(s): 2.3 us, 57.8 sy, 0.0 ni, 39.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
> > KiB Mem: 3020044 total, 2429420 used, 590624 free, 27468 buffers
> > KiB Swap: 2097148 total, 0 used, 2097148 free. 1924268 cached Mem
> >
> >   PID USER  PR  NI   VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
> >   608 root  20   0      0      0      0 R  99.6  0.0  11:34.38 kcompactd0
> >  9782 root  20   0      0      0      0 I   7.9  0.0   0:59.02 kworker/0:
> >  2971 root  20   0  46624  23076  13576 S   4.3  0.8   2:50.22 Xorg
>
> I've noticed this as well on earlier kernels (next-20181224 to 20190115)
>
> Some more info:
>
> 1) echo 3 > /proc/sys/vm/drop_caches unwedges kcompactd in 1-3 seconds.
>

This aspect is curious as it indicates that kcompactd could potentially
be looping infinitely, but it's not something I've experienced myself.
By any chance is there a predictable reproduction case for this?
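
For anyone trying to narrow down a reproduction case, one option is to
sample the compaction counters in /proc/vmstat while kcompactd0 is
spinning and compare two readings. The following is only a minimal
sketch, not something taken from the thread: the helper names
(parse_vmstat, counter_deltas, sample_deltas) are hypothetical, and it
assumes a kernel that exposes counters with a "compact_" prefix, the
exact set of which varies by kernel version.

```python
import time

# Assumption: compaction counters in /proc/vmstat share this prefix
# (e.g. compact_migrate_scanned, compact_free_scanned). The exact set
# depends on the kernel version and configuration.
COMPACTION_PREFIX = "compact_"


def parse_vmstat(text):
    """Return {name: value} for the compaction counters in vmstat text."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0].startswith(COMPACTION_PREFIX):
            counters[parts[0]] = int(parts[1])
    return counters


def counter_deltas(before, after):
    """Per-counter increase between two samples (absent counters count as 0)."""
    return {name: after[name] - before.get(name, 0) for name in after}


def sample_deltas(interval=5.0, path="/proc/vmstat"):
    """Read the counters twice, `interval` seconds apart, and return deltas."""
    with open(path) as f:
        before = parse_vmstat(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_vmstat(f.read())
    return counter_deltas(before, after)
```

Calling sample_deltas() once while kcompactd0 is at 100% CPU and once
right after the drop_caches write would show whether the scan counters
are still advancing in the wedged state. How to interpret the deltas is
left open here; they are raw data for a report, not a diagnosis.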

> I've also seen khugepaged hung up:
>
> cat /proc/29/stack
> [<0>] ___preempt_schedule+0x16/0x18
> [<0>] page_vma_mapped_walk+0x60/0x840
> [<0>] remove_migration_pte+0x67/0x390
> [<0>] rmap_walk_file+0x186/0x380
> [<0>] rmap_walk+0xa3/0xd0
> [<0>] remove_migration_ptes+0x69/0x70
> [<0>] migrate_pages+0xb6d/0xfd8
> [<0>] compact_zone+0xb70/0x1370
> [<0>] compact_zone_order+0xd8/0x120
> [<0>] try_to_compact_pages+0xe5/0x550
> [<0>] __alloc_pages_direct_compact+0x6d/0x1a0
> [<0>] __alloc_pages_slowpath+0x6c9/0x1640
> [<0>] __alloc_pages_nodemask+0x558/0x5b0
> [<0>] khugepaged+0x499/0x810
> [<0>] kthread+0x158/0x170
> [<0>] ret_from_fork+0x3a/0x50
> [<0>] 0xffffffffffffffff
>
> Looks like something has gone astray with compact_zone.
>

It's possible that the buffer aspect of the trace is a red herring and
that some corner case prevents the migration scanner and free scanner
from meeting and exiting compaction. Again, a reproduction case of some
sort would be nice, or an indication of how long it takes to trigger.

An update of the series is due which may or may not fix this, but if it
doesn't, we'll need to start tracing this to see what's going on at the
point of failure.

-- 
Mel Gorman
SUSE Labs