From: Yang Shi
Date: Mon, 27 Jan 2020 17:25:13 -0800
Subject: Re: [PATCH] mm: avoid blocking lock_page() in kcompactd
To: Matthew Wilcox
Cc: Michal Hocko, Cong Wang, LKML, Andrew Morton, linux-mm, Mel Gorman, Vlastimil Babka
In-Reply-To: <20200127190653.GA8708@bombadil.infradead.org>
References: <20200109225646.22983-1-xiyou.wangcong@gmail.com>
 <20200110073822.GC29802@dhcp22.suse.cz>
 <20200121090048.GG29276@dhcp22.suse.cz>
 <20200126233935.GA11536@bombadil.infradead.org>
 <20200127150024.GN1183@dhcp22.suse.cz>
 <20200127190653.GA8708@bombadil.infradead.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jan 27, 2020 at 11:06 AM Matthew Wilcox wrote:
>
> On Mon, Jan 27, 2020 at 04:00:24PM +0100, Michal Hocko wrote:
> > On Sun 26-01-20 15:39:35, Matthew Wilcox wrote:
> > > On Sun, Jan 26, 2020 at 11:53:55AM -0800, Cong Wang wrote:
> > > > I suspect the process gets stuck in the retry loop in try_charge(), as
> > > > the _shortest_ stacktrace of the perf samples indicated:
> > > >
> > > > cycles:ppp:
> > > >         ffffffffa72963db mem_cgroup_iter
> > > >         ffffffffa72980ca mem_cgroup_oom_unlock
> > > >         ffffffffa7298c15 try_charge
> > > >         ffffffffa729a886 mem_cgroup_try_charge
> > > >         ffffffffa720ec03 __add_to_page_cache_locked
> > > >         ffffffffa720ee3a add_to_page_cache_lru
> > > >         ffffffffa7312ddb iomap_readpages_actor
> > > >         ffffffffa73133f7 iomap_apply
> > > >         ffffffffa73135da iomap_readpages
> > > >         ffffffffa722062e read_pages
> > > >         ffffffffa7220b3f __do_page_cache_readahead
> > > >         ffffffffa7210554 filemap_fault
> > > >         ffffffffc039e41f __xfs_filemap_fault
> > > >         ffffffffa724f5e7 __do_fault
> > > >         ffffffffa724c5f2 __handle_mm_fault
> > > >         ffffffffa724cbc6 handle_mm_fault
> > > >         ffffffffa70a313e __do_page_fault
> > > >         ffffffffa7a00dfe page_fault
> > > >
> > > > But I don't see how it could be, the only possible case is when
> > > > mem_cgroup_oom() returns OOM_SUCCESS. However I can't
> > > > find any clue in dmesg pointing to OOM. These processes in the
> > > > same memcg are either running or sleeping (that is not exiting or
> > > > coredump'ing), I don't see how and why they could be selected as
> > > > a victim of OOM killer. I don't see any signal pending either from
> > > > their /proc/X/status.
> > >
> > > I think this is a situation where we might end up with a genuine deadlock
> > > if we're not trylocking the pages. readahead allocates a batch of
> > > locked pages and adds them to the pagecache. If it has allocated,
> > > say, 5 pages, successfully inserted the first three into i_pages, then
> > > needs to allocate memory to insert the fourth one into i_pages, and
> > > the process then attempts to migrate the pages which are still locked,
> > > they will never come unlocked because they haven't yet been submitted
> > > to the filesystem for reading.
> >
> > Just to make sure I understand. Do you mean this?
> > lock_page(A)
> > alloc_pages
> >   try_to_compact_pages
> >     compact_zone_order
> >       compact_zone(MIGRATE_SYNC_LIGHT)
> >         migrate_pages
> >           unmap_and_move
> >             __unmap_and_move
> >               lock_page(A)
>
> Yes. There's a little more to it than that, eg slab is involved, but
> you have it in a nutshell.

But how could compaction get blocked on a readahead page if that page is not on the LRU?
The page is charged before it is added to the LRU, so if the kernel just
retries the charge or reclaims forever, the page should not be on the LRU,
and therefore it should not block compaction.
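
The two points discussed above (the self-deadlock from a blocking lock_page() on a page the task already holds, and the objection that a page not yet on the LRU is never picked by compaction) can be illustrated with a small user-space toy model. This is only a sketch under stated assumptions, not kernel code: the names Page, on_lru, migrate_candidates(), try_migrate() and readahead_then_compact() are hypothetical, and a non-reentrant threading.Lock stands in for the page lock.

```python
import threading


class Page:
    """Hypothetical stand-in for a page: a lock plus an 'on LRU' flag."""

    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()  # non-reentrant, like a held page lock
        self.on_lru = False           # assumed to be set only after the charge succeeds


def migrate_candidates(pages):
    """Toy compaction scan: only pages already on the LRU are considered."""
    return [p for p in pages if p.on_lru]


def try_migrate(page):
    """Trylock variant: skip the page instead of sleeping on its lock."""
    if not page.lock.acquire(blocking=False):
        return False  # lock is held elsewhere, so skip this page
    try:
        return True   # the migration itself is outside the scope of this sketch
    finally:
        page.lock.release()


def readahead_then_compact():
    """Model a task that locks page A, then enters compaction while charging."""
    a = Page("A")
    a.lock.acquire()  # readahead holds the page lock
    # The charge (and any retry loop) would run here; A is not on the LRU yet.
    candidates = migrate_candidates([a])
    # A blocking acquire here would never return; the trylock simply fails.
    skipped = not try_migrate(a)
    a.lock.release()
    return candidates, skipped


if __name__ == "__main__":
    print(readahead_then_compact())
```

In this model the locked page is never a migration candidate while it is off the LRU, and even if it were, the trylock path would skip it rather than sleep. Whether the real kernel paths behave this way is exactly the open question in the thread.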