Subject: Re: VFIO (PCI) and write combine mapping of BARs
From: Benjamin Herrenschmidt
To: Jason Gunthorpe, Lorenzo Pieralisi
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, alex.williamson@redhat.com, osamaabb@amazon.com, linux-pci@vger.kernel.org, Clint Sbisa, catalin.marinas@arm.com, maz@kernel.org
Date: Tue, 25 Jul 2023 16:15:39 +1000
References: <2838d716b08c78ed24fdd3fe392e21222ee70067.camel@kernel.crashing.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, 2023-07-14 at 09:37 -0300, Jason Gunthorpe wrote:
>
> There are two topics here
>
> 1) Make ARM KVM allow the VM to select WC for its MMIO. This has
>    evolved in a way that is not related to VFIO
>
> 2) Allow VFIO to create mmaps with WC for non-VM use cases like DPDK.
>
> We have a draft patch for #1, and I think a general understanding with
> ARM folks that this is the right direction.
>
> 2 is more like what this email talks about - providing mmaps with
> specific flags.
>
> Benjamin, which are you interested in?

Sorry for the delay, got caught up...

The customer request we have (and what I was indeed talking about) is
2. That said, when running in a VM, 2 won't do much without 1.

> > > The problem isn't so much the low level implementation, we just have to
> > > play with the pgprot, the question is more around what API to present
> > > to control this.
>
> Assuming this is for #2, I think VFIO has fallen into a bit of a trap
> by allowing userspace to form the mmap offset. I've seen this happen
> in other subsystems too. It seems like a good idea then you realize
> you need more stuff in the mmap space and become sad.
>
> Typically the way out is to convert the mmap offset into a cookie where
> userspace issues some ioctl and then the ioctl returns an opaque mmap
> offset to use.
>
> eg in the vfio context you'd do some 'prepare region for mmap' ioctl
> where you could specify flags. The kernel would encode the flags in
> the cookie and then mmap would do the right thing. Adding more stuff
> is done by enhancing the prepare ioctl.
>
> Legacy mmap offsets are kept working.

This is indeed what I have in mind. I.e., VFIO has legacy regions and
add-on regions, though the latter are currently only exploited by some
drivers that create their own add-on regions.
My proposal is to add an ioctl to create them from userspace as
"children" of an existing driver-provided region, allowing different
attributes to be set for mmap.

> > > This is still quite specific to PCI, but so is the entire regions
> > > mechanism, so I don't see an easy path to something more generic at
> > > this stage.
>
> Regions are general, but the encoding of the mmap cookie has various
> PCI semantics when used with the PCI interface.
>
> We'd want the same ability with platform devices too, for instance.

In the current VFIO the implementation is *entirely* in vfio_pci_core
for PCI and entirely in vfio_platform_common.c for platform, so while
the same ioctls could be imagined to create sub-regions, it would have
to be completely implemented twice unless we do a lot of heavy lifting
to move some of that region stuff into common code.

But yes, apart from that, no objection :-)

Cheers,
Ben.
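
For illustration only, the "prepare region for mmap" cookie idea discussed
in this mail could be modelled in plain userspace C roughly as below. This
is a sketch under stated assumptions, not an existing VFIO interface: the
struct, the flag name, the bit layout and all function names are
hypothetical. The index shift of 40 is assumed to resemble how the PCI
code places the region index in the upper offset bits today, and the flag
bits are kept below bit 63 so the cookie stays a non-negative off_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cookie layout, for illustration only (not a real VFIO ABI). */
#define COOKIE_INDEX_SHIFT 40          /* assumed: region index above bit 40 */
#define COOKIE_FLAG_SHIFT  60          /* assumed: flags in bits 60..62 */
#define COOKIE_FLAG_BITS   3
#define COOKIE_FLAG_WC     (1ULL << 0) /* hypothetical "write-combine" flag */
#define COOKIE_FLAG_MASK   ((1ULL << COOKIE_FLAG_BITS) - 1)
#define COOKIE_INDEX_MASK \
	((1ULL << (COOKIE_FLAG_SHIFT - COOKIE_INDEX_SHIFT)) - 1)

/* Hypothetical argument of a "prepare region for mmap" request. */
struct region_mmap_req {
	uint32_t parent_index; /* existing driver-provided region */
	uint64_t flags;        /* mapping attributes, e.g. COOKIE_FLAG_WC */
};

/* What the "kernel side" would hand back: an opaque mmap offset. */
uint64_t prepare_mmap_cookie(const struct region_mmap_req *req)
{
	assert(req->parent_index <= COOKIE_INDEX_MASK);
	assert((req->flags & ~COOKIE_FLAG_MASK) == 0);
	return (req->flags << COOKIE_FLAG_SHIFT) |
	       ((uint64_t)req->parent_index << COOKIE_INDEX_SHIFT);
}

/* What the mmap handler would decode from the offset it receives. */
uint32_t cookie_region_index(uint64_t cookie)
{
	return (uint32_t)((cookie >> COOKIE_INDEX_SHIFT) & COOKIE_INDEX_MASK);
}

int cookie_wants_wc(uint64_t cookie)
{
	return ((cookie >> COOKIE_FLAG_SHIFT) & COOKIE_FLAG_WC) != 0;
}
```

In this model a legacy offset formed as index << 40 carries no flag bits,
so it decodes to the same region with the default (non-WC) attributes,
which is one way to read "legacy mmap offsets are kept working". New
attributes would be added by extending the request structure rather than
by carving up the offset space further.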