Message-ID:
<72dd168c-dd40-356c-1fe5-02bdfca57d73@linux.ibm.com>
Date: Tue, 15 Mar 2022 10:17:07 -0400
Subject: Re: [PATCH v4 15/32] vfio: introduce KVM-owned IOMMU type
To: "Tian, Kevin", Jason Gunthorpe, Alex Williamson
Cc: "linux-s390@vger.kernel.org", "cohuck@redhat.com",
 "schnelle@linux.ibm.com", "farman@linux.ibm.com", "pmorel@linux.ibm.com",
 "borntraeger@linux.ibm.com", "hca@linux.ibm.com", "gor@linux.ibm.com",
 "gerald.schaefer@linux.ibm.com", "agordeev@linux.ibm.com",
 "svens@linux.ibm.com", "frankja@linux.ibm.com", "david@redhat.com",
 "imbrenda@linux.ibm.com", "vneethv@linux.ibm.com", "oberpar@linux.ibm.com",
 "freude@linux.ibm.com", "thuth@redhat.com", "pasic@linux.ibm.com",
 "joro@8bytes.org", "will@kernel.org", "pbonzini@redhat.com",
 "corbet@lwn.net", "kvm@vger.kernel.org", "linux-kernel@vger.kernel.org",
 "iommu@lists.linux-foundation.org", "linux-doc@vger.kernel.org"
References: <20220314194451.58266-1-mjrosato@linux.ibm.com>
 <20220314194451.58266-16-mjrosato@linux.ibm.com>
 <20220314165033.6d2291a5.alex.williamson@redhat.com>
 <20220314231801.GN11336@nvidia.com>
From: Matthew Rosato
X-Mailing-List: linux-kernel@vger.kernel.org

On 3/15/22 3:57 AM, Tian, Kevin wrote:
>> From: Jason Gunthorpe
>> Sent: Tuesday, March 15, 2022 7:18 AM
>>
>> On Mon, Mar 14, 2022 at 04:50:33PM -0600, Alex Williamson wrote:
>>
>>>> +/*
>>>> + * The KVM_IOMMU type implies that the hypervisor will control the mappings
>>>> + * rather than userspace
>>>> + */
>>>> +#define VFIO_KVM_IOMMU 11
>>>
>>> Then why is this hosted in the type1 code that exposes a wide variety
>>> of userspace interfaces? Thanks,
>>
>> It is really badly named, this is the root level of a 2 stage nested
>> IO page table, and this approach needed a special flag to distinguish
>> the setup from the normal iommu_domain.
>>
>> If we do try to stick this into VFIO it should probably use the
>> VFIO_TYPE1_NESTING_IOMMU instead - however, we would like to delete
>> that flag entirely as it was never fully implemented, was never used,
>> and isn't part of what we are proposing for IOMMU nesting on ARM
>> anyhow. (So far I've found nobody to explain what the plan here was..)
>>
>> This is why I said the second level should be an explicit iommu_domain
>> all on its own that is explicitly coupled to the KVM to read the page
>> tables, if necessary.
>>
>> But I'm not sure that reading the userspace io page tables with KVM is
>> even the best thing to do - the iommu driver already has the pinned
>> memory, it would be faster and more modular to traverse the io page
>> tables through the pfns in the root iommu_domain than by having KVM do
>> the translations. Lets see what Matthew says..
>>
>
> Reading this thread it's sort of like an optimization to software nesting.

Yes, we want to avoid breaking out to userspace for a very frequent
operation (RPCIT / updating shadow mappings).

> If that is the case does it make more sense to complete the basic form
> of software nesting first and then adds this optimization?
>
> The basic form would allow the userspace to create a special domain
> type which points to a user/guest page table (like hardware nesting)
> but doesn't install the user page table to the IOMMU hardware (unlike
> hardware nesting). When receiving invalidate cmd from userspace
> the iommu driver walks the user page table (1st-level) and the parent
> page table (2nd-level) to generate a shadow mapping for the
> invalidated range in the non-nested hardware page table of this
> special domain type.
>
> Once that works what this series does just changes the matter of
> how the invalidate cmd is triggered. Previously iommu driver receives
> invalidate cmd from Qemu (via iommufd uAPI) while now receiving
> the cmd from kvm (via iommufd kAPI) upon interception of RPCIT.
> From this angle once the connection between iommufd and kvm fd
> is established there is even no direct talk between iommu driver and
> kvm.

But something somewhere still needs to be responsible for
pinning/unpinning the guest table entries upon each RPCIT interception.
For example, the RPCIT intercept can happen because the guest wants to
invalidate some old mappings or has generated some new mappings over a
range. So we must shadow the new mappings (by pinning the guest entries
and placing them in the host hardware table) and drop the invalidated
ones (by unpinning them and clearing their entries in the host hardware
table).
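
To make that bookkeeping concrete, here is a rough userspace toy model of what handling one RPCIT intercept over a range would have to do. This is only a sketch under loud assumptions: `shadow_ctx`, `shadow_init`, `translate`, `shadow_refresh`, and the flat arrays are all made up for illustration and are not kernel, iommufd, or KVM interfaces. Real tables are multi-level, and real pinning goes through the mm rather than a counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NPAGES  16          /* size of the toy address spaces, in pages */
#define INVALID UINT64_MAX  /* marks an unmapped entry */

/* Hypothetical toy state; flat arrays stand in for real page tables. */
struct shadow_ctx {
    uint64_t guest_tbl[NPAGES];   /* 1st level: IOVA page -> guest pfn */
    uint64_t parent_tbl[NPAGES];  /* 2nd level: guest pfn -> host pfn */
    uint64_t shadow_tbl[NPAGES];  /* host hardware table: IOVA page -> host pfn */
    unsigned pin_count[NPAGES];   /* pins currently held per host pfn */
};

static void shadow_init(struct shadow_ctx *c)
{
    for (size_t i = 0; i < NPAGES; i++) {
        c->guest_tbl[i] = INVALID;
        c->parent_tbl[i] = INVALID;
        c->shadow_tbl[i] = INVALID;
        c->pin_count[i] = 0;
    }
}

/* Walk both levels for one IOVA page; INVALID if either level is unmapped. */
static uint64_t translate(const struct shadow_ctx *c, size_t iova_pg)
{
    uint64_t gpfn = c->guest_tbl[iova_pg];
    uint64_t hpfn;

    if (gpfn >= NPAGES)           /* also covers INVALID */
        return INVALID;
    hpfn = c->parent_tbl[gpfn];
    return hpfn < NPAGES ? hpfn : INVALID;
}

/*
 * Model of one intercepted invalidation over [start, start + count):
 * re-walk the guest table, pin and install new mappings, clear and
 * unpin stale ones. Returns the number of shadow entries that changed.
 */
static size_t shadow_refresh(struct shadow_ctx *c, size_t start, size_t count)
{
    size_t changed = 0;

    assert(start <= NPAGES && count <= NPAGES - start);

    for (size_t pg = start; pg < start + count; pg++) {
        uint64_t new_hpfn = translate(c, pg);
        uint64_t old_hpfn = c->shadow_tbl[pg];

        if (new_hpfn == old_hpfn)
            continue;                   /* entry is already up to date */
        if (new_hpfn != INVALID)
            c->pin_count[new_hpfn]++;   /* pin before the device can reach it */
        c->shadow_tbl[pg] = new_hpfn;   /* install or clear the shadow entry */
        if (old_hpfn != INVALID)
            c->pin_count[old_hpfn]--;   /* unpin what is no longer reachable */
        changed++;
    }
    return changed;
}
```

In this model a caller would update `guest_tbl` (standing in for the guest editing its own table) and then call `shadow_refresh()` over the affected range, which is the step that today requires a trip out to userspace. The open question in the thread is which component owns that function, not what it has to do.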