Received: by 2002:ab2:7855:0:b0:1f9:5764:f03e with SMTP id m21csp366777lqp; Wed, 22 May 2024 07:03:33 -0700 (PDT) X-Forwarded-Encrypted: i=3; AJvYcCUHQDuovl4YE3s+VhD+gUrE+Wwzk2dXkTh+6cQrHQmhpWudkO4+qE7njM/aEqShEOnJ6xmQEym9XLG9BgoVET76QSrJ5fiBUS72ypocSw== X-Google-Smtp-Source: AGHT+IFbdU/IHR6/VTMyJ44BrBl4v7s2QRxImFILjqk0BEASOc13WPuzh/cxWIAk1oEhdISYDPfg X-Received: by 2002:a05:6a20:9192:b0:1af:6428:1812 with SMTP id adf61e73a8af0-1b1ca3aa01bmr19342677637.4.1716386613502; Wed, 22 May 2024 07:03:33 -0700 (PDT) ARC-Seal: i=2; a=rsa-sha256; t=1716386613; cv=pass; d=google.com; s=arc-20160816; b=WoJHfUZb7IrwqVT4sAcuNtz0R5Fik1eRNdJvE1yYCkLdz7SLGhq1r4EVW1VqQDO2rj F3You1+ACpF9QlDaqhEIeYH7D+mQr+UILwuD98xU/oEtIS3dz+l6IHFfnVzNV6gUQGk3 f3b4m2A1RfgtR5xXnj+vv7F060QFE7qSBietAp3QUnu4RxZg/JnN3AY3ZCHAKJWEqpIY Lk37NHR2HKg5B3sMO5SqwbU59ci6AqKt3dzqx7Ou3QcVE1faNJBWphjE1zZKZ9pCdZL+ VSStJajRXpLrZ3pgkwtJACwyVTSg4YoYyh4ucPfPYRufh/meCqed89wNbNTe/0vw+UHO oGhQ== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:in-reply-to:from:content-language :references:to:subject:cc:user-agent:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:date:message-id:dkim-signature; bh=HcRBPWHRiu3T/4cPCtUeVvCOFEIfSCuQrDDhHOCLwoU=; fh=5CmRKvMF1Lk/MeZbthZeyVYM1sWKHLc9kQUVCupamgY=; b=vK1fDRrPhn8RrxKhAUl75ruqCHZ7wbufldZUgOrO+P0uHLhUzgBor8y0NgFe5pe/+5 QHwgmn29FEWHmdUw7NQ5PLsKUqGoGfTp9baKlJJE8TV++vRw2RRUN99vWzk5/Ov0dgy3 4ZfTAPb6MKwHKfqi9iEV9UL3KMBIWqBabnx7OtCrF6AWuodwOUJq5xytZP8nikhdG5WS 2oNJfVFAgk2U4yu2wP0GstWf8Gp4QnKPHqplweMi4QpuOP6LXP3zgui6W7ttjHWCnNKH wlvLtwBeLhflETf3EYL2g4QcSkLxjFkuU7+5ruoOh5REYD/ZK4fYya7B88Up9xKR+Obg QfhQ==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="AtB4w8j/"; arc=pass (i=1 dkim=pass dkdomain=intel.com dmarc=pass fromdomain=linux.intel.com); spf=pass (google.com: domain of linux-kernel+bounces-186014-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:40f1:3f00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-186014-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from sy.mirrors.kernel.org (sy.mirrors.kernel.org. [2604:1380:40f1:3f00::1]) by mx.google.com with ESMTPS id 41be03b00d2f7-65e58cd68efsi4008203a12.126.2024.05.22.07.03.32 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 22 May 2024 07:03:33 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel+bounces-186014-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:40f1:3f00::1 as permitted sender) client-ip=2604:1380:40f1:3f00::1; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="AtB4w8j/"; arc=pass (i=1 dkim=pass dkdomain=intel.com dmarc=pass fromdomain=linux.intel.com); spf=pass (google.com: domain of linux-kernel+bounces-186014-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:40f1:3f00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-186014-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sy.mirrors.kernel.org (Postfix) with ESMTPS id D1158B218D0 for ; Wed, 22 May 2024 09:57:45 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 4E35181AA2; Wed, 22 May 2024 09:57:38 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="AtB4w8j/" Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.13]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4187B81723; Wed, 22 May 2024 09:57:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.13 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716371857; cv=none; b=ScGZJuPleSOnwJQjMJyg3gT4cWpib1rZA3xgtVbs9zMO9pJcogizcIDZ0XE5CfvmJiTgjP7YXt1sRumQjBhH3dFjedxel1uvKn4Q5EKpdVyKIT4yvcS3AL+L28vLgPmp1N97UpkOKHt795yZi98p6oHjqaanu/3nnxgXPBpfTcg= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716371857; c=relaxed/simple; bh=dxxOdiXH9g1wuJ/3cBi6DHlb9NEtTWrT0Fmqvonz0d4=; h=Message-ID:Date:MIME-Version:Cc:Subject:To:References:From: In-Reply-To:Content-Type; b=WY6BRVT17qQFFjEpL5qQ1KeMwSP7mJKhGYantmsy48PyCXRGFbW70ib6YnobDJQe7+8IlQM2TWHHjIglRhm2iiz7x4awGyAvg8mwCX5Hz602/EwSlBGJBRZVvmUFMR6V12Y+BfnnN+iZVGX6h44FOhitu931mOoUwneUNm4dyRQ= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=AtB4w8j/; arc=none smtp.client-ip=198.175.65.13 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1716371855; x=1747907855; h=message-id:date:mime-version:cc:subject:to:references: from:in-reply-to:content-transfer-encoding; bh=dxxOdiXH9g1wuJ/3cBi6DHlb9NEtTWrT0Fmqvonz0d4=; b=AtB4w8j/ro9/jpLNHlZ7Tdj5xaecSCUYjSPno8ys4Vbyif3RVphxpmJl iA9ypoyO+afCYJ2wrLAnyoQIomxXsf7F0E3A9a3RyDQ2LTd1ezR+r0PcN jk+34aqYqYe1hQ08XWUmABdhP7G7DzG+uStZ8vAdERk9vd2yOzKXjJzYS AEXms55LzZNkfmMFLwovvyfPuwPFBKwJVPGfYCve1xOVoYQfPLf0CyLIP s/umZXE3fjwmUui8iE30XhpG0Od59kbFRuKGXJClGVveWkjctItfkWhwr 6eioSJhuPJdJdN6KO95hZYykLWKkPLQyWRRP8EADCqF0nuaS5IaeqBEzV A==; X-CSE-ConnectionGUID: RZgnE18WTEuAA9Q+a2Ow6g== X-CSE-MsgGUID: kdO+Y10uR+K9aq7+yzIrAw== X-IronPort-AV: E=McAfee;i="6600,9927,11079"; a="23742131" X-IronPort-AV: E=Sophos;i="6.08,179,1712646000"; d="scan'208";a="23742131" Received: from fmviesa005.fm.intel.com ([10.60.135.145]) by orvoesa105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 May 2024 02:57:33 -0700 X-CSE-ConnectionGUID: yYYhyTfXTjiSS1atoHXAkw== X-CSE-MsgGUID: tHrzgwNSSZ+GNrt630I2bQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.08,179,1712646000"; d="scan'208";a="37627845" Received: from blu2-mobl.ccr.corp.intel.com (HELO [10.125.248.220]) ([10.125.248.220]) by fmviesa005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 May 2024 02:57:28 -0700 Message-ID: Date: Wed, 22 May 2024 17:57:26 +0800 Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Cc: baolu.lu@linux.intel.com, "will@kernel.org" , "robin.murphy@arm.com" , "suravee.suthikulpanit@amd.com" , "joro@8bytes.org" , "linux-kernel@vger.kernel.org" , "iommu@lists.linux.dev" , "linux-arm-kernel@lists.infradead.org" , "linux-tegra@vger.kernel.org" , "Liu, Yi L" , "eric.auger@redhat.com" , "vasant.hegde@amd.com" , "jon.grimm@amd.com" , "santosh.shukla@amd.com" , "Dhaval.Giani@amd.com" , "shameerali.kolothum.thodi@huawei.com" Subject: Re: [PATCH RFCv1 04/14] iommufd: Add struct iommufd_viommu and iommufd_viommu_ops To: "Tian, Kevin" , Jason Gunthorpe , Nicolin Chen References: <8610498e3fc00000e78bb9cef6fac9f6a54978a4.1712978212.git.nicolinc@nvidia.com> Content-Language: en-US From: Baolu Lu In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit On 2024/5/22 16:58, Tian, Kevin wrote: >> From: Jason Gunthorpe >> Sent: Tuesday, May 14, 2024 11:56 PM >> >> On Sun, May 12, 2024 at 08:34:02PM -0700, Nicolin Chen wrote: >>> On Sun, May 12, 2024 at 11:03:53AM -0300, Jason Gunthorpe wrote: >>>> On Fri, Apr 12, 2024 at 08:47:01PM -0700, Nicolin Chen wrote: >>>>> Add a new iommufd_viommu core structure to represent a vIOMMU >> instance in >>>>> the user space, typically backed by a HW-accelerated feature of an >> IOMMU, >>>>> e.g. NVIDIA CMDQ-Virtualization (an ARM SMMUv3 extension) and >> AMD Hardware >>>>> Accelerated Virtualized IOMMU (vIOMMU). >>>> I expect this will also be the only way to pass in an associated KVM, >>>> userspace would supply the kvm when creating the viommu. >>>> >>>> The tricky bit of this flow is how to manage the S2. It is necessary >>>> that the S2 be linked to the viommu: >>>> >>>> 1) ARM BTM requires the VMID to be shared with KVM >>>> 2) AMD and others need the S2 translation because some of the HW >>>> acceleration is done inside the guest address space >>>> >>>> I haven't looked closely at AMD but presumably the VIOMMU create will >>>> have to install the S2 into a DID or something? >>>> >>>> So we need the S2 to exist before the VIOMMU is created, but the >>>> drivers are going to need some more fixing before that will fully >>>> work. > Can you elaborate on this point? VIOMMU is a dummy container when > it's created and the association to S2 comes relevant only until when > VQUEUE is created inside and linked to a device? then there should be > a window in between allowing the userspace to configure S2. > > Not saying against setting S2 up before vIOMMU creation. Just want > to better understand the rationale here. > >>>> Does the nesting domain create need the viommu as well (in place of >>>> the S2 hwpt)? That feels sort of natural. >>> Yes, I had a similar thought initially: each viommu is backed by >>> a nested IOMMU HW, and a special HW accelerator like VCMDQ could >>> be treated as an extension on top of that. It might not be very >>> straightforward like the current design having vintf<->viommu and >>> vcmdq <-> vqueue though... >> vqueue should be considered a sub object of the viommu and hold a >> refcount on the viommu object for its lifetime. >> >>> In that case, we can then support viommu_cache_invalidate, which >>> is quite natural for SMMUv3. Yet, I recall Kevin said that VT-d >>> doesn't want or need that. >> Right, Intel currently doesn't need it, but I feel like everyone will >> need this eventually as the fast invalidation path is quite important. >> > yes, there is no need but I don't see any harm of preparing for such > extension on VT-d. Logically it's clearer, e.g. if we decide to move > device TLB invalidation to a separate uAPI then vIOMMU is certainly > a clearer object to carry it. and hardware extensions really looks like > optimization on software implementations. > > and we do need make a decision now, given if we make vIOMMU as > a generic object for all vendors it may have potential impact on > the user page fault support which Baolu is working on. the so-called > fault object will be contained in vIOMMU, which is software managed > on VT-d/SMMU but passed through on AMD. And probably we don't > need another handle mechanism in the attach path, suppose the > vIOMMU object already contains necessary information to find out > iommufd_object for a reported fault. Yes, if the vIOMMU object tracks all iommufd devices that it manages. Best regards, baolu