Received: by 2002:a05:6a10:22f:0:0:0:0 with SMTP id 15csp4165508pxk; Tue, 8 Sep 2020 12:21:17 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzaDTmBAz9XkPXVbpHZTj8Cpq896WASdV6WBzhLia/Pmngumb3nr/bfF866fcHivF2G/QJw X-Received: by 2002:a17:906:8695:: with SMTP id g21mr26435577ejx.504.1599592877478; Tue, 08 Sep 2020 12:21:17 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1599592877; cv=none; d=google.com; s=arc-20160816; b=CSnLq3ODXYlgDDteBOIYaJ1gmtF5+fiCI7K20ZZDviXLx8on3YMapvZ5gVq+nhOFIg XNBg2IoJLciWxyD/gxtW8uD/GC4ElTQBhpv3x9vsiVt7H1I9wyh8QZrKNTuvg7Zdmwva cRdjwavfu75giIFlYXH6MlVt8TRfIRj83IzJHygItI83tnnrir0/51kTypv7m8sIiZv+ 0ZQMp8GS7qBeJqMV0IaCK0ByeV2ViGA4h+MeY47sUtaEbD2irD351uY9gN7Kx7ty9Ns7 fIDvfml3ku7SOvShaQfKqlqCyRQakq6D0eUgPoumXe8ARolM1inCtdzufvQHOKfNn4x/ vvAQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:in-reply-to:content-disposition :mime-version:references:message-id:subject:cc:to:from:date; bh=Nli95LqtcoLP7J5CO7nMUny8Zy0mzeKUd/d/IxxLeJM=; b=KaV/5wvhIuSlb0HcUI7T5hQ4CTTeJ4YG2Q2A3thpIqcMMrs2q7+/F9SfgHtUzchno/ 5d9GT0VVFK099lxGdJwAEsYnN8JxVWfkGse/D6DddPFeXNwhIg1GQ22WKiabRzSO3wJ2 PSQEWtMGnJveAzmwqhj+61pQEi07e1Gar8VQ0+waVapkVG7wgGX6B+y+DuqRtsAhL4HL 0pUy3MYNxRX4pzmFieV+UXhKx+XDEyYmB3mB6YiP4zEjoiDCKc1TWhvduhxplFFoby5g o5XDeg3ASp2ltLVO3c+LYobRBsXDRstqSsghprWJT/YK1/zU04xJrwm5wOT4aez8mz54 p+6w== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id qq6si12394229ejb.100.2020.09.08.12.20.54; Tue, 08 Sep 2020 12:21:17 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731713AbgIHTRE (ORCPT + 99 others); Tue, 8 Sep 2020 15:17:04 -0400 Received: from mx2.suse.de ([195.135.220.15]:52934 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731165AbgIHQDm (ORCPT ); Tue, 8 Sep 2020 12:03:42 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 779DCB6AC; Tue, 8 Sep 2020 14:35:06 +0000 (UTC) Date: Tue, 8 Sep 2020 16:35:03 +0200 From: Michal Hocko To: Zi Yan Cc: David Hildenbrand , Roman Gushchin , "Kirill A. Shutemov" , linux-mm@kvack.org, Rik van Riel , "Kirill A . Shutemov" , Matthew Wilcox , Shakeel Butt , Yang Shi , David Nellans , linux-kernel@vger.kernel.org Subject: Re: [RFC PATCH 00/16] 1GB THP support on x86_64 Message-ID: <20200908143503.GE26850@dhcp22.suse.cz> References: <20200902180628.4052244-1-zi.yan@sent.com> <20200903142300.bjq2um5y5nwocvar@box> <20200903163020.GG60440@carbon.dhcp.thefacebook.com> <8e677ead-206d-08dd-d73e-569bd3803e3b@redhat.com> <7E20392E-5ED7-4C22-9555-F3BAABF3CBE9@nvidia.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <7E20392E-5ED7-4C22-9555-F3BAABF3CBE9@nvidia.com> Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue 08-09-20 10:05:11, Zi Yan wrote: > On 8 Sep 2020, at 7:57, David Hildenbrand wrote: > > > On 03.09.20 18:30, Roman Gushchin wrote: > >> On Thu, Sep 03, 2020 at 05:23:00PM +0300, Kirill A. Shutemov wrote: > >>> On Wed, Sep 02, 2020 at 02:06:12PM -0400, Zi Yan wrote: > >>>> From: Zi Yan > >>>> > >>>> Hi all, > >>>> > >>>> This patchset adds support for 1GB THP on x86_64. It is on top of > >>>> v5.9-rc2-mmots-2020-08-25-21-13. > >>>> > >>>> 1GB THP is more flexible for reducing translation overhead and increasing the > >>>> performance of applications with large memory footprint without application > >>>> changes compared to hugetlb. > >>> > >>> This statement needs a lot of justification. I don't see 1GB THP as viable > >>> for any workload. Opportunistic 1GB allocation is very questionable > >>> strategy. > >> > >> Hello, Kirill! > >> > >> I share your skepticism about opportunistic 1 GB allocations, however it might be useful > >> if backed by an madvise() annotations from userspace application. In this case, > >> 1 GB THPs might be an alternative to 1 GB hugetlbfs pages, but with a more convenient > >> interface. > > > > I have concerns if we would silently use 1~GB THPs in most scenarios > > where be would have used 2~MB THP. I'd appreciate a trigger to > > explicitly enable that - MADV_HUGEPAGE is not sufficient because some > > applications relying on that assume that the THP size will be 2~MB > > (especially, if you want sparse, large VMAs). > > This patchset is not intended to silently use 1GB THP in place of 2MB THP. > First of all, there is a knob /sys/kernel/mm/transparent_hugepage/enable_1GB > to enable 1GB THP explicitly. Also, 1GB THP is allocated from a reserved CMA > region (although I had alloc_contig_pages as a fallback, which can be removed > in next version), so users need to add hugepage_cma=nG kernel parameter to > enable 1GB THP allocation. If a finer control is necessary, we can add > a new MADV_HUGEPAGE_1GB for 1GB THP. A global knob is insufficient. 1G pages will become a very precious resource as it requires a pre-allocation (reservation). So it really has to be an opt-in and the question is whether there is also some sort of access control needed. -- Michal Hocko SUSE Labs