Date: Wed, 5 Dec 2018 12:43:53 +0100
From: Michal Hocko
To: Mel Gorman
Cc: David Rientjes, Vlastimil Babka, Linus Torvalds, Andrea Arcangeli,
    ying.huang@intel.com, s.priebe@profihost.ag,
    Linux List Kernel Mailing, alex.williamson@redhat.com, lkp@01.org,
    kirill@shutemov.name, Andrew Morton, zi.yan@cs.rutgers.edu
Subject: Re: [LKP] [mm] ac5b2c1891: vm-scalability.throughput -61.3% regression
Message-ID: <20181205114353.GH1286@dhcp22.suse.cz>
In-Reply-To: <20181205104343.GZ23260@techsingularity.net>
References: <20181203185954.GM31738@dhcp22.suse.cz>
 <20181203201214.GB3540@redhat.com>
 <64a4aec6-3275-a716-8345-f021f6186d9b@suse.cz>
 <20181204104558.GV23260@techsingularity.net>
 <20181205090856.GY1286@dhcp22.suse.cz>
 <20181205104343.GZ23260@techsingularity.net>

On Wed 05-12-18 10:43:43, Mel Gorman wrote:
> On Wed, Dec 05, 2018 at 10:08:56AM +0100, Michal Hocko wrote:
> > On Tue 04-12-18 16:47:23, David Rientjes wrote:
> > > On Tue, 4 Dec 2018, Mel Gorman wrote:
> > >
> > > > What should also be kept in mind is that we should avoid conflating
> > > > locality preferences with THP preferences, which is separate from
> > > > THP allocation latencies. The whole __GFP_THISNODE approach is
> > > > pushing too hard on locality versus huge pages when MADV_HUGEPAGE
> > > > or always-defrag are used, which is very unfortunate given that
> > > > MADV_HUGEPAGE in itself says nothing about locality -- that is the
> > > > business of other madvise flags or a specific policy.
> > >
> > > We currently lack those other madvise modes or mempolicies: mbind() is
> > > not a viable alternative because we do not want to OOM-kill when local
> > > memory is depleted; we want to fall back to remote memory.
> >
> > Yes, there was a clear agreement that there is no suitable mempolicy
> > right now, and there were proposals to introduce MPOL_NODE_RECLAIM to
> > provide that behavior. This would be an improvement regardless of THP,
> > because the global node-reclaim policy was simply a disaster we had to
> > turn off by default, and its global semantic was the reason people just
> > gave up on using it completely.
>
> The alternative is to define a clear semantic for THP allocation
> requests that are considered "light", regardless of whether that needs
> a GFP flag or not. A sensible default might be:
>
> o Allocate THP local if the amount of work is light or non-existent.
> o Allocate THP remote if one is freely available with no additional work
>   (maybe kick remote kcompactd).
> o Allocate base page local if the amount of work is light or non-existent.
> o Allocate base page remote if the amount of work is light or non-existent.
> o Do heavy work in zonelist order until a base page is allocated somewhere.

I am not sure about the ordering without deeper consideration, but I
think THP should reflect the approach we have for base pages.
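A minimal sketch of that ordering, assuming hypothetical helpers: none of
the functions named below exist in the kernel, and a real implementation
would express these steps through gfp_mask and zonelist iteration rather
than explicit calls. It is meant only to make the fallback order concrete.

/*
 * Illustration only: the five-step fallback for a THP fault listed
 * above. Every helper here is invented for this sketch.
 */
static struct page *thp_fault_fallback_sketch(int local_nid)
{
	struct page *page;

	/* 1) THP on the local node, light effort only (no heavy reclaim). */
	page = alloc_thp_local_light(local_nid);
	if (page)
		return page;

	/*
	 * 2) THP on a remote node only if one is already free; at most
	 * kick the remote kcompactd to build more in the background.
	 */
	page = alloc_thp_remote_nowork();
	if (page)
		return page;
	kick_remote_kcompactd();

	/* 3) Base page on the local node, light effort. */
	page = alloc_base_local_light(local_nid);
	if (page)
		return page;

	/* 4) Base page on a remote node, light effort. */
	page = alloc_base_remote_light();
	if (page)
		return page;

	/*
	 * 5) Only now do heavy work (reclaim/compaction), walking nodes
	 * in zonelist order until a base page is found somewhere.
	 */
	return alloc_base_heavy_zonelist_order();
}

The point the sketch tries to capture is that every light option, on both
page sizes and both localities, is exhausted before any heavy reclaim or
compaction is attempted.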
> It's not something that could be clearly expressed with either NORETRY
> or THISNODE, but longer-term it might be saner than chopping and
> changing which flags are more important and which workload is most
> relevant. That runs the risk of a revert loop where each person
> targeting one workload reverts one patch to insert another, until
> someone throws up their hands in frustration and just carries patches
> out-of-tree long-term.

Fully agreed!

> I'm not going to prototype something along these lines for now, as
> fundamentally a better compaction could cut out part of the root cause
> of the pain.

Yes, there is some groundwork to be done first.
-- 
Michal Hocko
SUSE Labs
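To make David's mbind() point above concrete, here is a minimal userspace
sketch. madvise(2), mbind(2), MADV_HUGEPAGE, and MPOL_BIND are the real
Linux API (mbind is declared in libnuma's numaif.h; link with -lnuma), but
the scenario is purely illustrative:

#define _GNU_SOURCE
#include <sys/mman.h>	/* mmap, munmap, madvise, MADV_HUGEPAGE */
#include <numaif.h>	/* mbind, MPOL_BIND */
#include <stddef.h>

/*
 * Illustration only. MADV_HUGEPAGE expresses a THP preference and says
 * nothing about locality; the locality comes from the mempolicy. With
 * a strict MPOL_BIND to node 0, depleting node 0 means reclaim and
 * eventually the OOM killer rather than a fallback to another node,
 * which is why mbind() cannot stand in for a "prefer local, fall back
 * remote" semantic.
 */
static void *thp_strictly_on_node0(size_t len)
{
	unsigned long nodemask = 1UL << 0;	/* node 0 only */
	void *p;

	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return NULL;

	madvise(p, len, MADV_HUGEPAGE);		/* THP preference only */

	/* Strict binding: no fallback when node 0 runs out of memory. */
	if (mbind(p, len, MPOL_BIND, &nodemask,
		  sizeof(nodemask) * 8, 0) != 0) {
		munmap(p, len);
		return NULL;
	}
	return p;
}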