Date: Thu, 24 May 2018 14:18:53 +0200
From: Michal Hocko
To: Huaisheng HS1 Ye
Cc: "akpm@linux-foundation.org", "linux-mm@kvack.org",
    "willy@infradead.org", "vbabka@suse.cz",
    "mgorman@techsingularity.net", "kstewart@linuxfoundation.org",
    "alexander.levin@verizon.com", "gregkh@linuxfoundation.org",
    "colyli@suse.de", NingTing Cheng, Ocean HY1 He,
    "linux-kernel@vger.kernel.org", "iommu@lists.linux-foundation.org",
    "xen-devel@lists.xenproject.org", "linux-btrfs@vger.kernel.org",
    Christoph Hellwig
Subject: Re: [External] Re: [RFC PATCH v2 00/12] get rid of GFP_ZONE_TABLE/BAD
Message-ID: <20180524121853.GG20441@dhcp22.suse.cz>
References: <1526916033-4877-1-git-send-email-yehs2007@gmail.com>
    <20180522183728.GB20441@dhcp22.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed 23-05-18 16:07:16, Huaisheng HS1 Ye wrote:
> From: Michal Hocko [mailto:mhocko@kernel.org]
> Sent: Wednesday, May 23, 2018 2:37 AM
> >
> > On Mon 21-05-18 23:20:21, Huaisheng Ye wrote:
> > > From: Huaisheng Ye
> > >
> > > Replace GFP_ZONE_TABLE and GFP_ZONE_BAD with encoded zone number.
> > >
> > > Delete ___GFP_DMA, ___GFP_HIGHMEM and ___GFP_DMA32 from GFP bitmasks,
> > > the bottom three bits of GFP mask is reserved for storing encoded
> > > zone number.
> > >
> > > The encoding method is XOR. Get zone number from enum zone_type,
> > > then encode the number with ZONE_NORMAL by XOR operation.
> > > The goal is to make sure ZONE_NORMAL can be encoded to zero. So,
> > > the compatibility can be guaranteed, such as GFP_KERNEL and GFP_ATOMIC
> > > can be used as before.
> > >
> > > Reserve __GFP_MOVABLE in bit 3, so that it can continue to be used as
> > > a flag. Same as before, __GFP_MOVABLE respresents movable migrate type
> > > for ZONE_DMA, ZONE_DMA32, and ZONE_NORMAL. But when it is enabled with
> > > __GFP_HIGHMEM, ZONE_MOVABLE shall be returned instead of ZONE_HIGHMEM.
> > > __GFP_ZONE_MOVABLE is created to realize it.
> > >
> > > With this patch, just enabling __GFP_MOVABLE and __GFP_HIGHMEM is not
> > > enough to get ZONE_MOVABLE from gfp_zone. All callers should use
> > > GFP_HIGHUSER_MOVABLE or __GFP_ZONE_MOVABLE directly to achieve that.
> > >
> > > Decode zone number directly from bottom three bits of flags in gfp_zone.
> > > The theory of encoding and decoding is,
> > > A ^ B ^ B = A
> >
> > So why is this any better than the current code. Sure I am not a great
> > fan of GFP_ZONE_TABLE because of how it is incomprehensible but this
> > doesn't look too much better, yet we are losing a check for incompatible
> > gfp flags.
> > The diffstat looks really sound but then you just look and
> > see that the large part is the comment that at least explained the gfp
> > zone modifiers somehow and the debugging code. So what is the selling
> > point?
>
> Dear Michal,
>
> Let me try to reply your questions.
> Exactly, GFP_ZONE_TABLE is too complicated. I think there are two advantages
> from the series of patches.
>
> 1. XOR operation is simple and efficient, GFP_ZONE_TABLE/BAD need to do twice
> shift operations, the first is for getting a zone_type and the second is for
> checking the to be returned type is a correct or not. But with these patch XOR
> operation just needs to use once. Because the bottom 3 bits of GFP bitmask have
> been used to represent the encoded zone number, we can say there is no bad zone
> number if all callers could use it without buggy way. Of course, the returned
> zone type in gfp_zone needs to be no more than ZONE_MOVABLE.

But you are losing the ability to check for wrong usage. And it seems
that the sad reality is that the existing code does screw up.

> 2. GFP_ZONE_TABLE has limit with the amount of zone types. Current GFP_ZONE_TABLE
> is 32 bits, in general, there are 4 zone types for most of X86_64 platform, they
> are ZONE_DMA, ZONE_DMA32, ZONE_NORMAL and ZONE_MOVABLE. If we want to expand the
> amount of zone types to larger than 4, the zone shift should be 3.

But we do not want to expand the number of zones IMHO. The existing zoo
is quite a maintenance pain.

That being said, I am not saying that I am in love with GFP_ZONE_TABLE.
It always makes my head explode when I look there but it seems to work
with the current code and it is optimized for it. If you want to change
this then you should make sure you describe reasons _why_ this is an
improvement. And I would argue that "we can have more zones" is a
relevant one.
-- 
Michal Hocko
SUSE Labs
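
For reference, the XOR scheme from the quoted patch description
(A ^ B ^ B = A, with B = ZONE_NORMAL) can be sketched roughly as below.
This is only an illustration, not code from the series: the zone
numbering, ZONE_BITS_MASK and the helpers zone_encode(), zone_decode(),
zone_bits_valid() and zone_selfcheck() are all hypothetical names, and
the enum assumes a configuration where all five zones exist.

```c
#include <assert.h>

/* Hypothetical zone numbering for illustration only; the real
 * enum zone_type depends on the kernel configuration. */
enum zone_type {
	ZONE_DMA,
	ZONE_DMA32,
	ZONE_NORMAL,
	ZONE_HIGHMEM,
	ZONE_MOVABLE,
};

/* Hypothetical name: the bottom three bits hold the encoded zone. */
#define ZONE_BITS_MASK 0x7u

/* Encode: XOR with ZONE_NORMAL, so ZONE_NORMAL itself becomes 0. */
static inline unsigned int zone_encode(enum zone_type z)
{
	return ((unsigned int)z ^ (unsigned int)ZONE_NORMAL) & ZONE_BITS_MASK;
}

/* Decode: XOR with ZONE_NORMAL again, since A ^ B ^ B == A. */
static inline enum zone_type zone_decode(unsigned int flags)
{
	return (enum zone_type)((flags & ZONE_BITS_MASK) ^
				(unsigned int)ZONE_NORMAL);
}

/* Return 1 if the bottom bits decode to a known zone, 0 otherwise. */
static inline int zone_bits_valid(unsigned int flags)
{
	return zone_decode(flags) <= ZONE_MOVABLE;
}

/* Round-trip every known zone through encode and decode. */
static inline void zone_selfcheck(void)
{
	unsigned int z;

	for (z = ZONE_DMA; z <= ZONE_MOVABLE; z++)
		assert(zone_decode(zone_encode((enum zone_type)z)) ==
		       (enum zone_type)z);
}
```

With this numbering, three bits give eight bit patterns for five zones,
so some patterns decode to a value above ZONE_MOVABLE. The range test
in zone_bits_valid() catches only those; it says nothing about other
flag combinations that are incompatible with the encoded zone.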