Received: by 2002:a25:1985:0:0:0:0:0 with SMTP id 127csp1676007ybz; Thu, 16 Apr 2020 13:28:14 -0700 (PDT) X-Google-Smtp-Source: APiQypK0a/l9JnDvngMHZWmiRmCfnLkb0TwqKJlZ4Gx90/SBtB3YUdFc7w4g+jFxaUkXrtWYcGSj X-Received: by 2002:aa7:d3d6:: with SMTP id o22mr37350edr.52.1587068894782; Thu, 16 Apr 2020 13:28:14 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1587068894; cv=none; d=google.com; s=arc-20160816; b=mqRunKuRmBcE6Um30SZHgms7L1j+qVHgnDJpgYB0DnV3umZQ95IYNTa6B42MxJQJOk r8+s1Sy95WwnmvwBxI6+JuIKijZrFs6aK+bv9JURvU5rNEx3mZM4XRuk8L3I6Bvzk8BK LfKq8s5MVfeNPvUh/lgdnOj+syxRAnHgLuGGKdYnKyPTDT4+Du1NS0h6GSOvANT9E/R7 mc1u8l2R6uxOhzuQ0UnqTtJtpCLIp+r3f2qGhf+QPC+iiyDpawUj12GgmRaeWOFfP5+q Vzvn5GH2P+m23gmjF5MuwZBJFYzxkZciGcC5rivPJIskU+SB0cvQvLucZYX5wd9r4VLk v2VQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :references:in-reply-to:message-id:subject:cc:to:from:date :dkim-signature; bh=80aHjGWongnpeQZzKC1CPTA8jUpzqlKutXIbnyqQU+g=; b=fH+ti/192oJQjw0cw3cbgc8Vx6drLUuZjPhwEjTSnQcSq7rqDDgEb0g/Tlqbz9WPY4 wjJYbr/LDdX/1Xwjb94kPDB/aSzcjkIJ7xcpbWVJjem4JrRLE+BVDIvkzDy27Nxk4+DU F1BV0LYgM9MfXUQBKl2y50XHtmRujpnFp8LrJY3dKDyV/i/dCz+adc1olV6OL6eWa7jZ 9SzdEP1gV9reLAa26+HXVAcgdzGPsaeG3wpqK04J4ibCIC5ajoEQqeo98WLPga18NTwN pe/T+QFdC+xJ/4LlO0IpLr4+6QHQ0+LF+p62TtXAiHRaWSsTX2jYNAnyTv27MJhdQriv 7gfw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=XUDIJ2lA; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id y14si13396221edm.185.2020.04.16.13.27.52; Thu, 16 Apr 2020 13:28:14 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=XUDIJ2lA; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2394351AbgDPOk7 (ORCPT + 99 others); Thu, 16 Apr 2020 10:40:59 -0400 Received: from us-smtp-2.mimecast.com ([207.211.31.81]:46399 "EHLO us-smtp-delivery-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S2394425AbgDPOko (ORCPT ); Thu, 16 Apr 2020 10:40:44 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1587048042; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=80aHjGWongnpeQZzKC1CPTA8jUpzqlKutXIbnyqQU+g=; b=XUDIJ2lAQAeiRSxu+ZVH/HmvTO58ygJ+a0KpR2audKl//2CqOWyrISSOJ97AX0XhZqB2+m kOvveuq5AmCM21NllwNWEnwtUHtkp1bqpub+T6tr5Bu1RmFlsPj79/hvtavaHfLBxNOWfc iuJTR2GbSSUpiDtK6qQqHRlsCw+zJAI= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-357--hdoMGM2MUCQI3EYGcbQRg-1; Thu, 16 Apr 2020 10:40:41 -0400 X-MC-Unique: -hdoMGM2MUCQI3EYGcbQRg-1 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 16E2E107B267; Thu, 16 Apr 2020 14:40:39 +0000 (UTC) Received: from w520.home (ovpn-112-162.phx2.redhat.com [10.3.112.162]) by smtp.corp.redhat.com (Postfix) with ESMTP id EFC4B116D95; Thu, 16 Apr 2020 14:40:31 +0000 (UTC) Date: Thu, 16 Apr 2020 08:40:31 -0600 From: Alex Williamson To: "Liu, Yi L" Cc: "Tian, Kevin" , "eric.auger@redhat.com" , "jacob.jun.pan@linux.intel.com" , "joro@8bytes.org" , "Raj, Ashok" , "Tian, Jun J" , "Sun, Yi Y" , "jean-philippe@linaro.org" , "peterx@redhat.com" , "iommu@lists.linux-foundation.org" , "kvm@vger.kernel.org" , "linux-kernel@vger.kernel.org" , "Wu, Hao" Subject: Re: [PATCH v1 7/8] vfio/type1: Add VFIO_IOMMU_CACHE_INVALIDATE Message-ID: <20200416084031.7266ad40@w520.home> In-Reply-To: References: <1584880325-10561-1-git-send-email-yi.l.liu@intel.com> <1584880325-10561-8-git-send-email-yi.l.liu@intel.com> <20200402142428.2901432e@w520.home> <20200403093436.094b1928@w520.home> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, 16 Apr 2020 10:40:03 +0000 "Liu, Yi L" wrote: > Hi Alex, > Still have a direction question with you. Better get agreement with you > before heading forward. > > > From: Alex Williamson > > Sent: Friday, April 3, 2020 11:35 PM > [...] > > > > > + * > > > > > + * returns: 0 on success, -errno on failure. > > > > > + */ > > > > > +struct vfio_iommu_type1_cache_invalidate { > > > > > + __u32 argsz; > > > > > + __u32 flags; > > > > > + struct iommu_cache_invalidate_info cache_info; > > > > > +}; > > > > > +#define VFIO_IOMMU_CACHE_INVALIDATE _IO(VFIO_TYPE, > > VFIO_BASE > > > > + 24) > > > > > > > > The future extension capabilities of this ioctl worry me, I wonder if > > > > we should do another data[] with flag defining that data as CACHE_INFO. > > > > > > Can you elaborate? Does it mean with this way we don't rely on iommu > > > driver to provide version_to_size conversion and instead we just pass > > > data[] to iommu driver for further audit? > > > > No, my concern is that this ioctl has a single function, strictly tied > > to the iommu uapi. If we replace cache_info with data[] then we can > > define a flag to specify that data[] is struct > > iommu_cache_invalidate_info, and if we need to, a different flag to > > identify data[] as something else. For example if we get stuck > > expanding cache_info to meet new demands and develop a new uapi to > > solve that, how would we expand this ioctl to support it rather than > > also create a new ioctl? There's also a trade-off in making the ioctl > > usage more difficult for the user. I'd still expect the vfio layer to > > check the flag and interpret data[] as indicated by the flag rather > > than just passing a blob of opaque data to the iommu layer though. > > Thanks, > > Based on your comments about defining a single ioctl and a unified > vfio structure (with a @data[] field) for pasid_alloc/free, bind/ > unbind_gpasid, cache_inv. After some offline trying, I think it would > be good for bind/unbind_gpasid and cache_inv as both of them use the > iommu uapi definition. While the pasid alloc/free operation doesn't. > It would be weird to put all of them together. So pasid alloc/free > may have a separate ioctl. It would look as below. Does this direction > look good per your opinion? > > ioctl #22: VFIO_IOMMU_PASID_REQUEST > /** > * @pasid: used to return the pasid alloc result when flags == ALLOC_PASID > * specify a pasid to be freed when flags == FREE_PASID > * @range: specify the allocation range when flags == ALLOC_PASID > */ > struct vfio_iommu_pasid_request { > __u32 argsz; > #define VFIO_IOMMU_ALLOC_PASID (1 << 0) > #define VFIO_IOMMU_FREE_PASID (1 << 1) > __u32 flags; > __u32 pasid; > struct { > __u32 min; > __u32 max; > } range; > }; Can't the ioctl return the pasid valid on alloc (like GET_DEVICE_FD)? Would it be useful to support freeing a range of pasids? If so then we could simply use range for both, ie. allocate a pasid from this range and return it, or free all pasids in this range? vfio already needs to track pasids to free them on release, so presumably this is something we could support easily. > ioctl #23: VFIO_IOMMU_NESTING_OP > struct vfio_iommu_type1_nesting_op { > __u32 argsz; > __u32 flags; > __u32 op; > __u8 data[]; > }; data only has 4-byte alignment, I think we really want it at an 8-byte alignment. This is why I embedded the "op" into the flag for DEVICE_FEATURE. Thanks, Alex > > /* Nesting Ops */ > #define VFIO_IOMMU_NESTING_OP_BIND_PGTBL 0 > #define VFIO_IOMMU_NESTING_OP_UNBIND_PGTBL 1 > #define VFIO_IOMMU_NESTING_OP_CACHE_INVLD 2 > > Thanks, > Yi Liu >