From: Vitaly Kuznetsov
To: Michal Hocko
Cc: linux-mm@kvack.org, Mel Gorman, qiuxishi@huawei.com, toshi.kani@hpe.com,
    xieyisheng1@huawei.com, slaoub@gmail.com, iamjoonsoo.kim@lge.com,
    Zhang Zhen, Reza Arbab, Yasuaki Ishimatsu, Tang Chen, Vlastimil Babka,
    Andrea Arcangeli, LKML, Andrew Morton, David Rientjes, Daniel Kiper,
    Igor Mammedov, Andi Kleen
Subject: Re: ZONE_NORMAL vs. ZONE_MOVABLE
References: <20170315091347.GA32626@dhcp22.suse.cz> <87shmedddm.fsf@vitty.brq.redhat.com> <20170315122914.GG32620@dhcp22.suse.cz>
Date: Wed, 15 Mar 2017 13:53:09 +0100
In-Reply-To: <20170315122914.GG32620@dhcp22.suse.cz> (Michal Hocko's message of "Wed, 15 Mar 2017 13:29:14 +0100")
Message-ID: <87k27qd7m2.fsf@vitty.brq.redhat.com>

Michal Hocko writes:

> On Wed 15-03-17 11:48:37, Vitaly Kuznetsov wrote:
>> Michal Hocko writes:
> [...]
>> Speaking about the long term approach,
>
> Not really related to the patch but ok (I hope this will not distract
> from the original intention here)...
>

Yes, not directly related to your patch.

>> (I'm not really familiar with the history of the memory zones code, so
>> please bear with me if my questions are stupid.)
>>
>> Currently, when we online memory blocks we need to know where to put the
>> boundary between NORMAL and MOVABLE, and this is a very hard decision to
>> make, no matter whether we do it from the kernel or from userspace. In
>> theory, we just want to avoid redundant limitations with future unplug,
>> but we don't really know how much memory we'll need for kernel
>> allocations in the future.
>
> yes, and that is why I am not really all that happy about the whole
> movable zones concept. It is basically reintroducing highmem issues from
> 32b times. But this is the only concept we currently have to provide a
> reliable memory hotremove right now.
>
>> What actually stops us from having the following approach:
>> 1) Everything is added to MOVABLE.
>> 2) When we're out of memory for kernel allocations in NORMAL, we
>>    'harvest' the first MOVABLE block and 'convert' it to NORMAL. It may
>>    happen that there are no free pages in this block, but since it was
>>    MOVABLE we can move all its allocations somewhere else.
>> 3) Freeing the whole 128MB memblock takes time, but we don't need to
>>    wait till it finishes; we just need to satisfy the currently pending
>>    allocation, and we can continue moving everything else in the
>>    background.
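
To make steps 2) and 3) a bit more concrete, here is a rough user-space
model of the flow I have in mind. It is purely illustrative: none of the
names below are existing kernel interfaces, and it completely ignores how
the actual page migration out of the harvested block would be done.

#include <stdio.h>
#include <stdbool.h>

#define BLOCK_MB 128

struct zone_model {
	const char *name;
	unsigned long nr_blocks;	/* 128MB blocks owned by this zone */
	unsigned long used_mb;		/* memory currently allocated from it */
};

static struct zone_model normal_zone  = { "NORMAL",  4, 0 };
static struct zone_model movable_zone = { "MOVABLE", 8, 0 };

/* Hand one MOVABLE block over to NORMAL; real code would migrate pages first. */
static bool convert_one_movable_block(void)
{
	if (!movable_zone.nr_blocks)
		return false;		/* nothing left to harvest */

	movable_zone.nr_blocks--;
	normal_zone.nr_blocks++;
	printf("converted one %dMB block: NORMAL=%lu MOVABLE=%lu blocks\n",
	       BLOCK_MB, normal_zone.nr_blocks, movable_zone.nr_blocks);
	return true;
}

/* A kernel (non-movable) allocation: grow NORMAL on demand instead of failing. */
static bool kernel_alloc(unsigned long mb)
{
	while (normal_zone.used_mb + mb > normal_zone.nr_blocks * BLOCK_MB)
		if (!convert_one_movable_block())
			return false;	/* genuinely out of memory */

	normal_zone.used_mb += mb;
	return true;
}

int main(void)
{
	/* NORMAL starts with 512MB; the second allocation forces a conversion. */
	printf("alloc 400MB: %s\n", kernel_alloc(400) ? "ok" : "failed");
	printf("alloc 200MB: %s\n", kernel_alloc(200) ? "ok" : "failed");
	return 0;
}

The point is only that the pending allocation is satisfied as soon as one
block has been handed over; emptying the rest of the block can continue
in the background.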
>
> Although it sounds like a good idea at first sight there are many tiny
> details which will make it much more complicated. First of all, how
> do we know that the lowmem (resp. all normal zones) are under
> pressure to reduce the movable zone? Getting OOM for a ~__GFP_MOVABLE
> request? Isn't that too late already?

Yes, I was basically thinking about OOM handling. It can also be a sort
of watermark-based decision.

> Sync migration at that state might
> be really non trivial (pages might be dirty, pinned etc...).

Non-trivial, yes, but we already have the code to move all allocations
away from a MOVABLE block when we try to offline it; we can probably
leverage it.

> What about
> user expectation to hotremove that memory later, should we just break
> it? How do we inflate movable zone back?

I think it's OK to leave such a block non-offlineable in the future. As
Andrea already pointed out, it is not practical to try to guarantee that
we can unplug everything we plugged in; we're talking about a 'best
effort' service here anyway.

>
>> An alternative approach would be to have lists of memblocks which
>> constitute ZONE_NORMAL and ZONE_MOVABLE instead of the simple 'NORMAL
>> before MOVABLE' rule we have now, but I'm not sure this is a viable
>> approach with the current code base.
>
> I am not sure I understand.

Now we have

  [Normal][Normal][Normal][Movable][Movable][Movable]

and we could have

  [Normal][Normal][Movable][Normal][Movable][Normal]

so when a new block comes in we make a decision about which zone to online
it into (based on memory usage in these zones), and a zone becomes a list
of the memblocks which constitute it, not a simple [from..to] range.
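
A tiny sketch of what such a data structure could look like (again purely
hypothetical, this is not how struct zone works today): a zone keeps an
explicit list of the memblock ranges that were onlined into it, rather
than a single span.

#include <stdio.h>
#include <stdlib.h>

struct memblock_range {
	unsigned long long start_pfn;
	unsigned long long nr_pages;
	struct memblock_range *next;
};

struct zone_desc {
	const char *name;
	struct memblock_range *ranges;	/* ordered list, possibly non-contiguous */
};

/* Online a new memory block into whichever zone we decided on. */
static void zone_add_range(struct zone_desc *z, unsigned long long start_pfn,
			   unsigned long long nr_pages)
{
	struct memblock_range *r = malloc(sizeof(*r));

	r->start_pfn = start_pfn;
	r->nr_pages = nr_pages;
	r->next = z->ranges;
	z->ranges = r;
}

int main(void)
{
	struct zone_desc normal  = { "Normal",  NULL };
	struct zone_desc movable = { "Movable", NULL };

	/* [Normal][Movable][Normal] becomes a valid layout. */
	zone_add_range(&normal,  0x00000, 0x8000);
	zone_add_range(&movable, 0x08000, 0x8000);
	zone_add_range(&normal,  0x10000, 0x8000);

	for (struct memblock_range *r = normal.ranges; r; r = r->next)
		printf("%s: pfn %#llx, %#llx pages\n", normal.name,
		       r->start_pfn, r->nr_pages);
	return 0;
}

--
Vitaly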