Date: Thu, 06 Jun 2024 11:17:36 +0100
Message-ID: <867cf2l6in.wl-maz@kernel.org>
From: Marc Zyngier
To: Steven Price
Cc:
 kvm@vger.kernel.org, kvmarm@lists.linux.dev, Catalin Marinas,
 Will Deacon, James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 Joey Gouly, Alexandru Elisei, Christoffer Dall, Fuad Tabba,
 linux-coco@lists.linux.dev, Ganapatrao Kulkarni
Subject: Re: [PATCH v3 12/14] arm64: realm: Support nonsecure ITS emulation shared
In-Reply-To: <4c363476-e5b5-42ff-9f30-a02a92b6751b@arm.com>
References: <20240605093006.145492-1-steven.price@arm.com>
 <20240605093006.145492-13-steven.price@arm.com>
 <86a5jzld9g.wl-maz@kernel.org>
 <4c363476-e5b5-42ff-9f30-a02a92b6751b@arm.com>

On Wed, 05 Jun 2024 16:08:49 +0100,
Steven Price wrote:
>
> Hi Marc,
>
> On 05/06/2024 14:39, Marc Zyngier wrote:
> > The subject line is... odd. I'd expect something like:
> >
> > "irqchip/gic-v3-its: Share ITS tables with a non-trusted hypervisor"
> >
> > because nothing here should be CCA specific.
>
> Good point - that's a much better subject.
>
> > On Wed, 05 Jun 2024 10:30:04 +0100,
> > Steven Price wrote:
> >>
> >> Within a realm guest the ITS is emulated by the host. This means the
> >> allocations must have been made available to the host by a call to
> >> set_memory_decrypted(). Introduce an allocation function which performs
> >> this extra call.
> >
> > This doesn't mention that this patch radically changes the allocation
> > of some tables.
>
> I guess that depends on your definition of radical, see below.

It's election time, I'm all about making bold statements!

[...]

> >> @@ -3334,8 +3365,9 @@ static bool its_alloc_table_entry(struct its_node *its,
> >>
> >>  	/* Allocate memory for 2nd level table */
> >>  	if (!table[idx]) {
> >> -		page = alloc_pages_node(its->numa_node, GFP_KERNEL | __GFP_ZERO,
> >> -					get_order(baser->psz));
> >> +		page = its_alloc_pages_node(its->numa_node,
> >> +					    GFP_KERNEL | __GFP_ZERO,
> >> +					    get_order(baser->psz));
> >>  		if (!page)
> >>  			return false;
> >>
> >> @@ -3418,7 +3450,9 @@ static struct its_device *its_create_device(struct its_node *its, u32 dev_id,
> >>  	unsigned long *lpi_map = NULL;
> >>  	unsigned long flags;
> >>  	u16 *col_map = NULL;
> >> +	struct page *page;
> >>  	void *itt;
> >> +	int itt_order;
> >>  	int lpi_base;
> >>  	int nr_lpis;
> >>  	int nr_ites;
> >> @@ -3430,7 +3464,6 @@ static struct its_device *its_create_device(struct its_node *its, u32 dev_id,
> >>  	if (WARN_ON(!is_power_of_2(nvecs)))
> >>  		nvecs = roundup_pow_of_two(nvecs);
> >>
> >> -	dev = kzalloc(sizeof(*dev), GFP_KERNEL);
> >>  	/*
> >>  	 * Even if the device wants a single LPI, the ITT must be
> >>  	 * sized as a power of two (and you need at least one bit...).
> >> @@ -3438,7 +3471,16 @@ static struct its_device *its_create_device(struct its_node *its, u32 dev_id,
> >>  	nr_ites = max(2, nvecs);
> >>  	sz = nr_ites * (FIELD_GET(GITS_TYPER_ITT_ENTRY_SIZE, its->typer) + 1);
> >>  	sz = max(sz, ITS_ITT_ALIGN) + ITS_ITT_ALIGN - 1;
> >> -	itt = kzalloc_node(sz, GFP_KERNEL, its->numa_node);
> >> +	itt_order = get_order(sz);
> >> +	page = its_alloc_pages_node(its->numa_node,
> >> +				    GFP_KERNEL | __GFP_ZERO,
> >> +				    itt_order);
> >
> > So we go from an allocation that was so far measured in *bytes* to
> > something that is now at least a page. Per device. This seems a bit
> > excessive to me, specially when it isn't conditioned on anything and
> > is now imposed on all platforms, including the non-CCA systems (which
> > are exactly 100% of the machines).
>
> Catalin asked about this in v2:
> https://lore.kernel.org/lkml/c329ae18-2b61-4851-8d6a-9e691a2007c8@arm.com/
>
> To be honest, I don't have a great handle on how much memory is being
> wasted here. Within the realm guest I was testing this is rounding up an
> otherwise 511 byte allocation to a 4k page, and there are 3 of them.
> Which seems reasonable from a realm guest perspective.

And not that reasonable on a smaller system, such as my own router VM
that has a whole lot of devices and very little memory. Not to mention
that while CCA is stuck with 4k pages (duh!), the world is moving
towards larger pages, meaning that this is wasting even more memory.

>
> I can see two options to improve here:
>
> 1. Add a !is_realm_world() check and return to the previous behaviour
> when not running in a realm. It's ugly, and doesn't deal with any other
> potential future memory encryption. cc_platform_has(CC_ATTR_MEM_ENCRYPT)
> might be preferable? But this means no impact to non-realm guests.

No, this is way too ugly, and doesn't help with things like pKVM.

>
> 2. Use a special (global) memory allocator that does the
> set_memory_decrypted() dance on the pages that it allocates but allows
> packing the allocations. I'm not aware of an existing kernel API for
> this, so it's potentially quite a bit of code. The benefit is that it
> reduces memory consumption in a realm guest, although fragmentation
> still means we're likely to see a (small) growth.
>
> Any thoughts on what you think would be best?

I would expect that something similar to kmem_cache could be of help,
only with the ability to deal with variable object sizes (in this
case: minimum of 256 bytes, in increments defined by the
implementation, and with a 256 byte alignment).

I don't think the ITS is particularly special here, and we should come
up with something that is generic enough to support sharing of
non-page-sized objects.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
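
For a rough sense of the page rounding discussed in this thread, here is
a small userspace sketch. It only mirrors the size arithmetic of the
quoted hunk and is not kernel code: the function names, the 8-byte ITT
entry size used in the example, and ITT_ALIGN = 256 (taken from the
256 byte alignment mentioned above) are assumptions made for
illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed ITT alignment in bytes, per the 256 byte alignment mentioned above. */
#define ITT_ALIGN 256u

/*
 * Bytes requested for one ITT, following the quoted hunk:
 *   nr_ites = max(2, nvecs);
 *   sz = nr_ites * entry_size;
 *   sz = max(sz, ITT_ALIGN) + ITT_ALIGN - 1;
 * entry_size stands in for the value derived from GITS_TYPER (hypothetical input).
 */
size_t itt_bytes(unsigned int nvecs, unsigned int entry_size)
{
	unsigned int nr_ites = nvecs < 2 ? 2 : nvecs;
	size_t sz = (size_t)nr_ites * entry_size;

	if (sz < ITT_ALIGN)
		sz = ITT_ALIGN;
	return sz + ITT_ALIGN - 1;
}

/*
 * Bytes consumed once the request is served by a power-of-two number of
 * pages, which is what an order-based page allocation implies.
 */
size_t page_alloc_bytes(size_t sz, size_t page_size)
{
	size_t bytes = page_size;

	/* page_size must be a non-zero power of two */
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	while (bytes < sz)
		bytes <<= 1;
	return bytes;
}
```

Under these assumptions a device asking for a single vector needs 511
bytes, which turns into a whole 4k page, or a whole 64k page on a
kernel built with 64k pages.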