Date: Wed, 17 Feb 2021 14:50:26 +0100
From: Michal Hocko
To: David Hildenbrand
Cc: Oscar Salvador, Andrew Morton, Mike Kravetz, Muchun Song, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] mm: Make alloc_contig_range handle free hugetlb pages
References: <20210217100816.28860-1-osalvador@suse.de> <20210217100816.28860-2-osalvador@suse.de>
 <182f6a4a-6f95-9911-7730-8718ab72ece2@redhat.com>
In-Reply-To: <182f6a4a-6f95-9911-7730-8718ab72ece2@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed 17-02-21 14:36:47, David Hildenbrand wrote:
> On 17.02.21 14:30, Michal Hocko wrote:
> > On Wed 17-02-21 11:08:15, Oscar Salvador wrote:
> > > Free hugetlb pages are tricky to handle so as to no userspace application
> > > notices disruption, we need to replace the current free hugepage with
> > > a new one.
> > >
> > > In order to do that, a new function called alloc_and_dissolve_huge_page
> > > is introduced.
> > > This function will first try to get a new fresh hugetlb page, and if it
> > > succeeds, it will dissolve the old one.
> > >
> > > With regard to the allocation, since we do not know whether the old page
> > > was allocated on a specific node on request, the node the old page belongs
> > > to will be tried first, and then we will fallback to all nodes containing
> > > memory (N_MEMORY).
> >
> > I do not think fallback to a different zone is ok. If yes then this
> > really requires a very good reasoning. alloc_contig_range is an
> > optimistic allocation interface at best and it shouldn't break carefully
> > node aware preallocation done by administrator.
>
> What does memory offlining do when migrating in-use hugetlbfs pages? Does it
> always keep the node?

No, it will break the node pool. The reasoning behind that is that
offlining is an explicit request from the userspace and it is expected
to break affinities because it is a destructive action from the memory
capacity point of view. It is impossible to keep the former affinity
while you are cutting the memory off under its user.

> I think keeping the node is the easiest/simplest approach for now.
>
> > >
> > > Note that gigantic hugetlb pages are fenced off since there is a cyclic
> > > dependency between them and alloc_contig_range.
> >
> > Why do we need/want to do all this in the first place?
>
> cma and virtio-mem (especially on ZONE_MOVABLE) really want to handle
> hugetlbfs pages.

Do we have any real life examples? Or does this fall more into the
"let's optimize an existing implementation" category?
-- 
Michal Hocko
SUSE Labs