Date: Wed, 6 Dec 2023 16:31:48 +0000
From: Catalin Marinas
To: Jason Gunthorpe
Cc: Marc Zyngier, ankita@nvidia.com, Shameerali Kolothum Thodi, oliver.upton@linux.dev, suzuki.poulose@arm.com, yuzenghui@huawei.com, will@kernel.org, ardb@kernel.org, akpm@linux-foundation.org, gshan@redhat.com, aniketa@nvidia.com, cjia@nvidia.com, kwankhede@nvidia.com, targupta@nvidia.com, vsethi@nvidia.com, acurrid@nvidia.com, apopple@nvidia.com, jhubbard@nvidia.com,
    danw@nvidia.com, mochs@nvidia.com, kvmarm@lists.linux.dev, kvm@vger.kernel.org, lpieralisi@kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v2 1/1] KVM: arm64: allow the VM to select DEVICE_* and NORMAL_NC for IO memory
In-Reply-To: <20231206151603.GR2692119@nvidia.com>

On Wed, Dec 06, 2023 at 11:16:03AM -0400, Jason Gunthorpe wrote:
> On Wed, Dec 06, 2023 at 12:14:18PM +0000, Catalin Marinas wrote:
> > We could do with a pgprot_maybewritecombine() or
> > pgprot_writecombinenospec() (similar to Jason's idea but without
> > changing the semantics of pgprot_device()). For the user mapping on
> > arm64 this would be Device (even _GRE) since it can't disable
> > speculation but stage 2 would leave the decision to the guest since the
> > speculative loads aren't much different from committed loads done
> > wrongly.
>
> This would be fine, as would a VMA flag. Please pick one :)
>
> I think a VMA flag is simpler than messing with pgprot.

I guess one could write a patch and see how it goes ;).

> > If we want the VMM to drive this entirely, we could add a new mmap()
> > flag like MAP_WRITECOMBINE or PROT_WRITECOMBINE. They do feel a bit
>
> As in the other thread, we cannot unconditionally map NORMAL_NC into
> the VMM.

I'm not suggesting this, but rather that the VMM maps portions of the
BAR with either Device or Normal-NC, concatenates them (MAP_FIXED) and
passes this range as a memory slot (or multiple slots if a slot doesn't
allow multiple vmas).

> > The latter has some benefits for DPDK but it's a lot more involved
> > with
>
> DPDK WC support will be solved with some VFIO-only change if anyone
> ever cares to make it, if that is what you mean.

Yeah. An argument I've heard in private and public discussions is that
the KVM device pass-through shouldn't be different from the DPDK case.
So fixing that would cover KVM as well, though we'd need additional
logic in the VMM. BenH had a short talk at Plumbers around this -
https://youtu.be/QLvN3KXCn0k?t=7010. There was some statement in there
that for x86, the guests are allowed to do WC without other KVM
restrictions (not sure whether that's the case, I'm not familiar with
it).

> > having to add device-specific knowledge into the VMM. The VMM would also
> > have to present the whole BAR contiguously to the guest even if there
> > are different mapping attributes within the range. So a lot of MAP_FIXED
> > uses. I'd rather leaving this decision with the guest than the VMM, it
> > looks like more hassle to create those mappings. The VMM or the VFIO
> > could only state write-combine and speculation allowed.
>
> We talked about this already, the guest must decide, the VMM doesn't
> have the information to pre-predict which pages the guest will want to
> use WC on.

Are the Device/Normal offsets within a BAR fixed and documented in e.g.
the spec, or is this something the guest configures via some MMIO?

-- 
Catalin