Message-ID: <81628606-ca9b-866f-5e71-91001e856871@suse.cz>
Date: Mon, 27 Nov 2023 12:13:59 +0100
From: Vlastimil Babka
Subject: Re: [PATCH v13 17/35] KVM: Add transparent hugepage support for dedicated guest memory
To: Paolo Bonzini, Sean Christopherson
Cc: Xiaoyao Li, Marc Zyngier, Oliver Upton, Huacai Chen, Michael Ellerman,
 Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexander Viro,
 Christian Brauner, "Matthew Wilcox (Oracle)", Andrew Morton,
 kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 kvmarm@lists.linux.dev, linux-mips@vger.kernel.org,
 linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org,
 linux-riscv@lists.infradead.org, linux-fsdevel@vger.kernel.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, Xu Yilun, Chao Peng,
 Fuad Tabba, Jarkko Sakkinen, Anish Moorthy, David Matlack, Yu Zhang,
 Isaku Yamahata, Mickaël Salaün, Vishal Annapurve, Ackerley Tng,
 Maciej Szmigiero, David Hildenbrand, Quentin Perret, Michael Roth, Wang,
 Liam Merwick, Isaku Yamahata, "Kirill A . Shutemov"
References: <20231027182217.3615211-18-seanjc@google.com>
 <7c0844d8-6f97-4904-a140-abeabeb552c1@intel.com>
 <92ba7ddd-2bc8-4a8d-bd67-d6614b21914f@intel.com>
 <4ca2253d-276f-43c5-8e9f-0ded5d5b2779@redhat.com>

On 11/2/23 16:46, Paolo Bonzini wrote:
> On Thu, Nov 2, 2023 at 4:38 PM Sean Christopherson wrote:
>> Actually, looking at this again, there's not actually a hard dependency on THP.
>> A THP-enabled kernel _probably_ gives a higher probability of using hugepages,
>> but mostly because THP selects COMPACTION, and I suppose because using THP for
>> other allocations reduces overall fragmentation.
>
> Yes, that's why I didn't even bother enabling it unless THP is
> enabled, but it makes even more sense to just try.
>
>> So rather than honor KVM_GUEST_MEMFD_ALLOW_HUGEPAGE iff THP is enabled, I think
>> we should do the below (I verified KVM can create hugepages with THP=n). We'll
>> need another capability, but (a) we probably should have that anyways and (b) it
>> provides a cleaner path to adding PUD-sized hugepage support in the future.
>
> I wonder if we need KVM_CAP_GUEST_MEMFD_HUGEPAGE_PMD_SIZE though. This
> should be a generic kernel API and in fact the sizes are available in
> a not-so-friendly format in /sys/kernel/mm/hugepages.
>
> We should just add /sys/kernel/mm/hugepages/sizes that contains
> "2097152 1073741824" on x86 (only the former if 1G pages are not
> supported).
>
> Plus: is this the best API if we need something else for 1G pages?
>
> Let's drop *this* patch and proceed incrementally. (Again, this is
> what I want to do with this final review: identify places that are
> still sticky, and don't let them block the rest).
>
> Coincidentally we have an open spot next week at plumbers. Let's
> extend Fuad's section to cover more guestmem work.

Hi,

was there any outcome wrt this one? Based on my experience with THPs it would
be best if userspace didn't have to opt in, nor care about the supported size.
If the given size is unaligned, provide a mix of large pages up to an aligned
size, and for the rest fall back to base pages, which should be better than
-EINVAL on creation (is it possible with the current implementation? I'd hope
so?). A way to opt out from huge pages could be useful, although there's always
the risk of some initial troubles resulting in various online sources
cargo-cult recommending to opt out forever.

Vlastimil
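
To make the two ideas in this thread concrete, here is a rough userspace
sketch, not kernel code and not a statement about how guest_memfd behaves.
It assumes the existing sysfs layout under /sys/kernel/mm/hugepages (entries
named like "hugepages-2048kB") and a 4096-byte base page as on x86. The
helper names hugepage_sizes() and split_size() are made up for illustration.
The first derives the byte sizes that the proposed "sizes" file would list;
the second shows the arithmetic of covering an unaligned size with huge
pages first and base pages for the remainder.

```python
import os
import re

# Existing sysfs directory referred to in the thread.
HUGEPAGES_DIR = "/sys/kernel/mm/hugepages"


def hugepage_sizes(entries):
    """Convert sysfs entry names such as 'hugepages-2048kB' to sizes in bytes."""
    sizes = []
    for name in entries:
        match = re.fullmatch(r"hugepages-(\d+)kB", name)
        if match:
            sizes.append(int(match.group(1)) * 1024)
    return sorted(sizes)


def split_size(size, huge_size, base_size=4096):
    """Hypothetical helper: cover `size` bytes with huge pages, then base pages.

    Returns (number of huge pages, number of base pages). The base page
    size of 4096 is an assumption; it differs on some architectures.
    """
    if size <= 0 or size % base_size:
        raise ValueError("size must be a positive multiple of the base page size")
    huge_pages, tail = divmod(size, huge_size)
    return huge_pages, tail // base_size


if __name__ == "__main__":
    if os.path.isdir(HUGEPAGES_DIR):
        print(hugepage_sizes(os.listdir(HUGEPAGES_DIR)))
    # 5 MiB with 2 MiB huge pages: two huge pages plus 256 base pages.
    print(split_size(5 * 1024 * 1024, 2 * 1024 * 1024))
```

The split assumes the range starts at a huge-page-aligned offset; an
unaligned start would also need base pages at the front.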