Date: Mon, 27 Jan 2020 16:00:24 +0100
From: Michal Hocko
To: Matthew Wilcox
Cc: Cong Wang, LKML, Andrew Morton, linux-mm, Mel Gorman, Vlastimil Babka
Subject: Re: [PATCH] mm: avoid blocking lock_page() in kcompactd
Message-ID: <20200127150024.GN1183@dhcp22.suse.cz>
In-Reply-To: <20200126233935.GA11536@bombadil.infradead.org>
References: <20200109225646.22983-1-xiyou.wangcong@gmail.com>
 <20200110073822.GC29802@dhcp22.suse.cz>
 <20200121090048.GG29276@dhcp22.suse.cz>
 <20200126233935.GA11536@bombadil.infradead.org>

On Sun 26-01-20 15:39:35, Matthew Wilcox wrote:
> On Sun, Jan 26, 2020 at 11:53:55AM -0800, Cong Wang wrote:
> > On Tue, Jan 21, 2020 at 1:00 AM Michal Hocko wrote:
> > >
> > > On Mon 20-01-20 14:48:05, Cong Wang wrote:
> > > > It got stuck somewhere along the call path of mem_cgroup_try_charge(),
> > > > and the trace events of mm_vmscan_lru_shrink_inactive() indicate this
> > > > too:
> > >
> > > So it seems that you are contending on the page lock. It is really
> > > unexpected that the reclaim would take that long though. Please try to
> > > enable more vmscan tracepoints to see where the time is spent.
> >
> > I suspect the process gets stuck in the retry loop in try_charge(), as
> > the _shortest_ stacktrace of the perf samples indicated:
> >
> > cycles:ppp:
> > ffffffffa72963db mem_cgroup_iter
> > ffffffffa72980ca mem_cgroup_oom_unlock
> > ffffffffa7298c15 try_charge
> > ffffffffa729a886 mem_cgroup_try_charge
> > ffffffffa720ec03 __add_to_page_cache_locked
> > ffffffffa720ee3a add_to_page_cache_lru
> > ffffffffa7312ddb iomap_readpages_actor
> > ffffffffa73133f7 iomap_apply
> > ffffffffa73135da iomap_readpages
> > ffffffffa722062e read_pages
> > ffffffffa7220b3f __do_page_cache_readahead
> > ffffffffa7210554 filemap_fault
> > ffffffffc039e41f __xfs_filemap_fault
> > ffffffffa724f5e7 __do_fault
> > ffffffffa724c5f2 __handle_mm_fault
> > ffffffffa724cbc6 handle_mm_fault
> > ffffffffa70a313e __do_page_fault
> > ffffffffa7a00dfe page_fault
> >
> > But I don't see how it could be; the only possible case is when
> > mem_cgroup_oom() returns OOM_SUCCESS. However, I can't find any
> > clue in dmesg pointing to OOM. These processes in the same memcg
> > are either running or sleeping (that is, not exiting or
> > coredump'ing); I don't see how and why they could be selected as
> > a victim of the OOM killer. I don't see any signal pending either
> > in their /proc/X/status.
>
> I think this is a situation where we might end up with a genuine deadlock
> if we're not trylocking the pages. Readahead allocates a batch of
> locked pages and adds them to the pagecache. If it has allocated,
> say, 5 pages, successfully inserted the first three into i_pages, then
> needs to allocate memory to insert the fourth one into i_pages, and
> the process then attempts to migrate the pages which are still locked,
> they will never come unlocked because they haven't yet been submitted
> to the filesystem for reading.
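To make the scenario above concrete, here is a minimal userspace
analogue (a hypothetical pthread sketch, not kernel code): one mutex
stands in for the page lock of a readahead page whose read has not
been submitted yet, and a helper stands in for the migration path.

#include <stdio.h>
#include <pthread.h>

/* Stands in for the lock of a page that readahead inserted locked. */
static pthread_mutex_t page_lock = PTHREAD_MUTEX_INITIALIZER;

/* Stands in for migrate_pages() -> __unmap_and_move(). */
static int migrate_page_sketch(int sync)
{
	if (pthread_mutex_trylock(&page_lock) != 0) {
		if (!sync)
			return -1;	/* async mode: skip the locked page */
		/*
		 * A sync mode would block here forever: the lock holder
		 * is the very context doing the migration, and the page
		 * is never unlocked because its read was never submitted.
		 */
		pthread_mutex_lock(&page_lock);
	}
	/* ... copy page contents and remap ... */
	pthread_mutex_unlock(&page_lock);
	return 0;
}

int main(void)
{
	/* Readahead inserted the page locked; I/O not yet submitted. */
	pthread_mutex_lock(&page_lock);

	/* Inserting the next page allocates and enters compaction: */
	if (migrate_page_sketch(0) < 0)
		printf("async migration skipped the locked page\n");

	/* migrate_page_sketch(1) would self-deadlock right here. */

	pthread_mutex_unlock(&page_lock);
	return 0;
}

With a default (non-recursive) mutex the trylock simply fails and the
page is skipped, which is exactly what trylocking buys in the kernel.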
Just to make sure I understand. Do you mean this?

lock_page(A)
alloc_pages
 try_to_compact_pages
  compact_zone_order
   compact_zone(MIGRATE_SYNC_LIGHT)
    migrate_pages
     unmap_and_move
      __unmap_and_move
       lock_page(A)

-- 
Michal Hocko
SUSE Labs
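For reference, the blocking acquisition at the bottom of that chain is
migration's fallback when a trylock fails; an abridged paraphrase of
__unmap_and_move() in mm/migrate.c from kernels of this era
(simplified, not the verbatim source):

	if (!trylock_page(page)) {
		/* Async compaction never sleeps on a page lock. */
		if (!force || mode == MIGRATE_ASYNC)
			goto out;

		/*
		 * PF_MEMALLOC contexts (direct reclaim/compaction) must
		 * not sleep here either: the page may be locked pending
		 * I/O that this very allocation is needed to complete.
		 */
		if (current->flags & PF_MEMALLOC)
			goto out;

		lock_page(page);	/* MIGRATE_SYNC_LIGHT can block here */
	}

MIGRATE_SYNC_LIGHT passes the first check once force is set, so a page
left locked by readahead can block the compacting task indefinitely,
closing the cycle drawn above.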