Date: Wed, 23 May 2018 22:19:19 -0700
From: Matthew Wilcox
To:
Michal Hocko
Cc: Huaisheng Ye, akpm@linux-foundation.org, linux-mm@kvack.org, vbabka@suse.cz, mgorman@techsingularity.net, kstewart@linuxfoundation.org, alexander.levin@verizon.com, gregkh@linuxfoundation.org, colyli@suse.de, chengnt@lenovo.com, hehy1@lenovo.com, linux-kernel@vger.kernel.org, iommu@lists.linux-foundation.org, xen-devel@lists.xenproject.org, linux-btrfs@vger.kernel.org, Huaisheng Ye
Subject: Re: [RFC PATCH v2 00/12] get rid of GFP_ZONE_TABLE/BAD
Message-ID: <20180524051919.GA9819@bombadil.infradead.org>
References: <1526916033-4877-1-git-send-email-yehs2007@gmail.com> <20180522183728.GB20441@dhcp22.suse.cz>
In-Reply-To: <20180522183728.GB20441@dhcp22.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, May 22, 2018 at 08:37:28PM +0200, Michal Hocko wrote:
> So why is this any better than the current code. Sure I am not a great
> fan of GFP_ZONE_TABLE because of how it is incomprehensible but this
> doesn't look too much better, yet we are losing a check for incompatible
> gfp flags. The diffstat looks really sound but then you just look and
> see that the large part is the comment that at least explained the gfp
> zone modifiers somehow and the debugging code. So what is the selling
> point?

I have a plan, but it's not exactly fully-formed yet.

One of the big problems we have today is that we have a lot of users
who have constraints on the physical memory they want to allocate,
but we have very limited abilities to provide them with what they're
asking for.  The various different ZONEs have different meanings on
different architectures and are generally a mess.

If we had eight ZONEs, we could offer:

ZONE_16M        // 24 bit
ZONE_256M       // 28 bit
ZONE_LOWMEM     // CONFIG_32BIT only
ZONE_4G         // 32 bit
ZONE_64G        // 36 bit
ZONE_1T         // 40 bit
ZONE_ALL        // everything larger
ZONE_MOVABLE    // movable allocations; no physical address guarantees

#ifdef CONFIG_64BIT
#define ZONE_NORMAL     ZONE_ALL
#else
#define ZONE_NORMAL     ZONE_LOWMEM
#endif

This would cover most driver DMA mask allocations; we could tweak the
offered zones based on analysis of what people need.

#define GFP_HIGHUSER            (GFP_USER | ZONE_ALL)
#define GFP_HIGHUSER_MOVABLE    (GFP_USER | ZONE_MOVABLE)

One other thing I want to see is that fallback from zones happens from
highest to lowest normally (i.e. if you fail to allocate in 1T, then you
try to allocate from 64G), but movable allocations happen from lowest
to highest.  So ZONE_16M ends up full of page cache pages which are
readily evictable for the rare occasions when we need to allocate memory
below 16MB.

I'm sure there are lots of good reasons why this won't work, which is
why I've been hesitant to propose it before now.
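
To make the fallback ordering above a bit more concrete, here is a rough
userspace sketch in C.  It is not kernel code: the enum, NR_ADDR_ZONES and
zone_fallback_order() are hypothetical names invented for illustration,
ZONE_LOWMEM is listed unconditionally rather than under CONFIG_32BIT, and
ZONE_MOVABLE itself is left out because nothing above says where it would
sit in the order.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical indices for the address-limited zones, in the order
 * listed above (most restrictive first).  ZONE_MOVABLE is omitted.
 */
enum zone_idx {
	ZONE_16M,	/* 24 bit */
	ZONE_256M,	/* 28 bit */
	ZONE_LOWMEM,	/* CONFIG_32BIT only in the proposal */
	ZONE_4G,	/* 32 bit */
	ZONE_64G,	/* 36 bit */
	ZONE_1T,	/* 40 bit */
	ZONE_ALL,	/* everything larger */
	NR_ADDR_ZONES
};

/*
 * Fill 'order' with the zones to try for a request whose highest
 * acceptable zone is 'highest', and return the number of entries.
 *
 * Non-movable requests walk from 'highest' down to ZONE_16M.
 * Movable requests walk from ZONE_16M up to 'highest', so the small
 * zones fill with pages that can be evicted when really needed.
 */
static size_t zone_fallback_order(enum zone_idx highest, int movable,
				  enum zone_idx order[NR_ADDR_ZONES])
{
	size_t n = 0;
	int z;

	assert(highest < NR_ADDR_ZONES);

	if (movable) {
		for (z = ZONE_16M; z <= (int)highest; z++)
			order[n++] = (enum zone_idx)z;
	} else {
		for (z = (int)highest; z >= (int)ZONE_16M; z--)
			order[n++] = (enum zone_idx)z;
	}
	return n;
}
```

With this sketch, a non-movable request capped at ZONE_1T would try
ZONE_1T, ZONE_64G, ZONE_4G, ZONE_LOWMEM, ZONE_256M and then ZONE_16M,
while a movable request with the same cap would try them in the reverse
order.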