From: Eric Auger
Reply-To: eric.auger@redhat.com
Subject: Re: Plan for /dev/ioasid RFC v2
To: "Tian, Kevin", Jason Gunthorpe, "Alex Williamson (alex.williamson@redhat.com)", Jean-Philippe Brucker, David Gibson, Jason Wang, "parav@mellanox.com", "Enrico Weigelt, metux IT consult", Paolo Bonzini, Shenming Lu
Cc: Jonathan Corbet, "Raj, Ashok", "Liu, Yi L", "Wu, Hao", "Jiang, Dave", Jacob Pan, Kirti Wankhede, Robin Murphy, "kvm@vger.kernel.org", "iommu@lists.linux-foundation.org", David Woodhouse, Joerg Roedel, LKML, Lu Baolu
Date: Wed, 9 Jun 2021 10:14:48 +0200

Hi Kevin,

On 6/7/21 4:58 AM, Tian, Kevin wrote:
> Hi, all,
>
> We plan to work on v2 now, given many good comments already received
> and substantial changes envisioned. This is a very complex topic with
> many sub-threads being discussed. To ensure that I didn't miss valuable
> suggestions (and also keep everyone on the same page), here I'd like to
> provide a list of planned changes in my mind. Please let me know if
> anything important is lost. :)
>
> --
>
> (Remaining opens in v1)
>
> - Protocol between kvm/vfio/ioasid for wbinvd/no-snoop. I'll see how
>   much can be refined based on discussion progress when v2 is out;
>
> - Device-centric (Jason) vs. group-centric (David) uAPI. David is not fully
>   convinced yet. Based on discussion v2 will continue to have ioasid uAPI
>   being device-centric (but it's fine for vfio to be group-centric).
>   A new section will be added to elaborate this part;
>
> - PASID virtualization (section 4) has not been thoroughly discussed yet.
>   Jason gave some suggestions on how to categorize intended usages.
>   I will rephrase this section and hope more discussions can be held for
>   it in v2;
>
> (Adopted suggestions)
>
> - (Jason) Rename /dev/ioasid to /dev/iommu (so does uAPI e.g.
>   IOASID_XXX to IOMMU_XXX). One suggestion (Jason) was to also rename
>   RID+PASID to SID+SSID. But given the familiarity of the former, I will
>   still use RID+PASID in v2 to ease the discussion;
>
> - (Jason) v1 prevents one device from binding to multiple ioasid_fd's. This
>   will be fixed in v2;
>
> - (Jean/Jason) No need to track guest I/O page tables on ARM/AMD. When
>   a pasid table is bound, it becomes a container for all guest I/O page tables;

While I am totally in line with that change, I guess we need to revisit
the invalidate ioctl to support PASID table invalidation.

>
> - (Jean/Jason) Accordingly a device label is required so iotlb invalidation
>   and fault handling can both support per-device operation. Per Jean's
>   suggestion, this label will come from userspace (when
>   VFIO_BIND_IOASID_FD);

What is not totally clear to me is the correspondence between this label
and the SID/SSID tuple. My understanding is that it rather maps to the SID,
because you can attach several ioasids to the device. So it is not clear to
me how you reconstruct the SSID info.

Thanks

Eric

>
> - (Jason) Addition of device label allows per-device capability/format
>   check before IOASIDs are created. This leads to another major uAPI
>   change in v2 - specify format info when creating an IOASID (mapping
>   protocol, nesting, coherent, etc.). User is expected to check per-device
>   format and then set proper format for IOASID upon to-be-attached
>   device;
> - (Jason/David) No restriction on map/unmap vs. bind/invalidate.
>   They can be used in either parent or child;
>
> - (David) Change IOASID_GET_INFO to report permitted range instead of
>   reserved IOVA ranges. This works better for PPC;
>
> - (Jason) For helper functions, expect to have explicit bus-type wrappers
>   e.g. ioasid_pci_device_attach;
>
> (Not adopted)
>
> - (Parav) Make page pinning a syscall;
> - (Jason W./Enrico) one I/O page table per fd;
> - (David) Replace IOASID_REGISTER_MEMORY through another ioasid
>   nesting (sort of passthrough mode). Need more thinking. v2 will not
>   change this part;
>
> Thanks
> Kevin
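
For reference, the IOASID_GET_INFO item in the list above (report permitted ranges instead of reserved IOVA ranges) amounts to reporting the complement of the reserved set within the address space. Below is a minimal sketch of that relationship in Python. The function name permitted_ranges, the inclusive (start, end) representation, and the example addresses are assumptions made for illustration only; nothing here is taken from the RFC or from an existing kernel interface.

```python
from typing import List, Tuple

# Hypothetical representation: an inclusive [start, end] IOVA range.
Range = Tuple[int, int]


def permitted_ranges(space: Range, reserved: List[Range]) -> List[Range]:
    """Return the parts of `space` that no reserved range covers.

    Illustrative helper only; the name and signature are assumptions,
    not part of any proposed uAPI.
    """
    lo, hi = space
    permitted: List[Range] = []
    cursor = lo  # lowest address not yet classified
    for start, end in sorted(reserved):
        if end < cursor:
            continue  # entirely below the cursor, already accounted for
        if start > hi:
            break  # entirely above the address space
        if start > cursor:
            # Gap between the cursor and this reserved range is usable.
            permitted.append((cursor, start - 1))
        cursor = max(cursor, end + 1)
    if cursor <= hi:
        permitted.append((cursor, hi))
    return permitted


# Example: a 32-bit space with one reserved window (values are illustrative).
for start, end in permitted_ranges((0, 0xFFFFFFFF), [(0xFEE00000, 0xFEEFFFFF)]):
    print(f"{start:#010x}-{end:#010x}")
```

With a permitted-range report, userspace gets the usable windows directly and does not have to assume a particular overall address width before subtracting the reserved holes.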