Date: Tue, 28 Jan 2020 12:39:53 +0100
From: Michal Hocko
To: Matthew Wilcox
Cc: Cong Wang, LKML, Andrew Morton, linux-mm, Mel Gorman, Vlastimil Babka
Subject: Re: [PATCH] mm: avoid blocking lock_page() in kcompactd
Message-ID: <20200128113953.GA24244@dhcp22.suse.cz>
References: <20200121090048.GG29276@dhcp22.suse.cz> <20200126233935.GA11536@bombadil.infradead.org> <20200127150024.GN1183@dhcp22.suse.cz> <20200127190653.GA8708@bombadil.infradead.org> <20200128081712.GA18145@dhcp22.suse.cz> <20200128083044.GB6615@bombadil.infradead.org> <20200128091352.GC18145@dhcp22.suse.cz> <20200128104857.GC6615@bombadil.infradead.org>
In-Reply-To: <20200128104857.GC6615@bombadil.infradead.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue 28-01-20 02:48:57, Matthew Wilcox wrote:
> On Tue, Jan 28, 2020 at 10:13:52AM +0100, Michal Hocko wrote:
> > On Tue 28-01-20 00:30:44, Matthew Wilcox wrote:
> > > On Tue, Jan 28, 2020 at 09:17:12AM +0100, Michal Hocko wrote:
> > > > On Mon 27-01-20 11:06:53, Matthew Wilcox wrote:
> > > > > On Mon, Jan 27, 2020 at 04:00:24PM +0100, Michal Hocko wrote:
> > > > > > On Sun 26-01-20 15:39:35, Matthew Wilcox wrote:
> > > > > > > On Sun, Jan 26, 2020 at 11:53:55AM -0800, Cong Wang wrote:
> > > > > > > > I suspect the process gets stuck in the retry loop in try_charge(), as
> > > > > > > > the _shortest_ stacktrace of the perf samples indicated:
> > > > > > > >
> > > > > > > > cycles:ppp:
> > > > > > > > ffffffffa72963db mem_cgroup_iter
> > > > > > > > ffffffffa72980ca mem_cgroup_oom_unlock
> > > > > > > > ffffffffa7298c15 try_charge
> > > > > > > > ffffffffa729a886 mem_cgroup_try_charge
> > > > > > > > ffffffffa720ec03 __add_to_page_cache_locked
> > > > > > > > ffffffffa720ee3a add_to_page_cache_lru
> > > > > > > > ffffffffa7312ddb iomap_readpages_actor
> > > > > > > > ffffffffa73133f7 iomap_apply
> > > > > > > > ffffffffa73135da iomap_readpages
> > > > > > > > ffffffffa722062e read_pages
> > > > > > > > ffffffffa7220b3f __do_page_cache_readahead
> > > > > > > > ffffffffa7210554 filemap_fault
> > > > > > > > ffffffffc039e41f __xfs_filemap_fault
> > > > > > > > ffffffffa724f5e7 __do_fault
> > > > > > > > ffffffffa724c5f2 __handle_mm_fault
> > > > > > > > ffffffffa724cbc6 handle_mm_fault
> > > > > > > > ffffffffa70a313e __do_page_fault
> > > > > > > > ffffffffa7a00dfe page_fault
> > > >
> > > > I am not deeply familiar with the readahead code. But is there really a
> > > > high order allocation (order > 1) that would trigger compaction in the
> > > > phase when pages are locked?
> > >
> > > Thanks to sl*b, yes:
> > >
> > > radix_tree_node 80890 102536 584 28 4 : tunables 0 0 0 : slabdata 3662 3662 0
> > >
> > > so it's allocating 4 pages for an allocation of a 576 byte node.
> >
> > I am not really sure that we do sync migration for costly orders.
>
> Doesn't the stack trace above indicate that we're doing migration as
> the result of an allocation in add_to_page_cache_lru()?

Which stack trace do you refer to? The one above doesn't show much more
than mem_cgroup_iter, and likewise for the others in this email thread.
I do not recall any stack with lock_page on the trace.

> > > > Btw. the compaction rejects to consider file backed pages when __GFP_FS
> > > > is not present AFAIR.
> > >
> > > Ah, that would save us.
> >
> > So the NOFS comes from the mapping GFP mask, right? That is something I
> > was hoping to get rid of eventually :/ Anyway it would be better to have
> > an explicit NOFS with a comment explaining why we need that. If for
> > nothing else then for documentation.
>
> I'd also like to see the mapping GFP mask go away, but rather than seeing
> an explicit GFP_NOFS here, I'd rather see the memalloc_nofs API used.

Completely agree here. The proper place for the scope would be where the
pages are locked, with an explanation that there are other allocations
down the line which might invoke sync migration, and that would be
dangerous. Having that explicitly documented is clearly an improvement.

-- 
Michal Hocko
SUSE Labs