Date: Fri, 3 Nov 2023 09:51:19 -0600
From: Alex Williamson
To: "Tian, Kevin"
Cc: "Chatre, Reinette", "jgg@nvidia.com", "yishaih@nvidia.com",
    "shameerali.kolothum.thodi@huawei.com", "kvm@vger.kernel.org",
    "Jiang, Dave", "Liu, Jing2", "Raj, Ashok", "Yu, Fenghua",
    "tom.zanussi@linux.intel.com", "linux-kernel@vger.kernel.org",
    "patches@lists.linux.dev"
Subject: Re: [RFC PATCH V3 00/26] vfio/pci: Back guest interrupts from Interrupt Message Store (IMS)
Message-ID: <20231103095119.63aa796f.alex.williamson@redhat.com>
References: <20231101120714.7763ed35.alex.williamson@redhat.com>
    <20231102151352.1731de78.alex.williamson@redhat.com>
Organization: Red Hat
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, 3 Nov 2023 07:23:13 +0000
"Tian, Kevin" wrote:

> > From: Alex Williamson
> > Sent: Friday, November 3, 2023 5:14 AM
> >
> > On Thu, 2 Nov 2023 03:14:09 +0000
> > "Tian, Kevin" wrote:
> >
> > > > From: Tian, Kevin
> > > > Sent: Thursday, November 2, 2023 10:52 AM
> > > >
> > > > >
> > > > > Without an in-tree user of this code, we're just chopping up code for
> > > > > no real purpose. There's no reason that a variant driver requiring IMS
> > > > > couldn't initially implement their own SET_IRQS ioctl. Doing that
> > > >
> > > > this is an interesting idea. We haven't seen a real usage which wants
> > > > such MSI emulation on IMS for variant drivers.
> > > > but if the code is
> > > > simple enough to demonstrate the 1st user of IMS it might not be
> > > > a bad choice. There are additional trap-emulation required in the
> > > > device MMIO bar (mostly copying MSI permission entry which contains
> > > > PASID info to the corresponding IMS entry). At a glance that area
> > > > is 4k-aligned so should be doable.
> > > >
> > >
> > > misread the spec. the MSI-X permission table which provides
> > > auxiliary data to MSI-X table is not 4k-aligned. It sits in the 1st
> > > 4k page together with many other registers. emulation of them
> > > could be simple with a native read/write handler but not sure
> > > whether any of them may sit in a hot path to affect perf due to
> > > trap...
> >
> > I'm not sure if you're referring to a specific device spec or the PCI
> > spec, but the PCI spec has long included an implementation note
> > suggesting alignment of the MSI-X vector table and pba and separation
> > from CSRs, and I see this is now even more strongly worded in the 6.0
> > spec.
> >
> > Note though that for QEMU, these are emulated in the VMM and not
> > written through to the device. The result of writes to the vector
> > table in the VMM are translated to vector use/unuse operations, which
> > we see at the kernel level through SET_IRQS ioctl calls. Are you
> > expecting to get PASID information written by the guest through the
> > emulated vector table? That would entail something more than a simple
> > IMS backend to MSI-X frontend. Thanks,
> >
>
> I was referring to IDXD device spec. Basically it allows a process to
> submit a descriptor which contains a completion interrupt handle.
> The handle is the index of a MSI-X entry or IMS entry allocated by
> the idxd driver. To mark the association between application and
> related handles the driver records the PASID of the application
> in an auxiliary structure for MSI-X (called MSI-X permission table)
> or directly in the IMS entry.
> This additional info includes whether
> an MSI-X/IMS entry has PASID enabled and if yes what is the PASID
> value to be checked against the descriptor.
>
> As you said virtualizing MSI-X table itself is via SET_IRQS and it's
> 4k aligned. Then we also need to capture guest updates to the MSI-X
> permission table and copy the PASID information into the
> corresponding IMS entry when using the IMS backend. It's MSI-X
> permission table not 4k aligned then trapping it will affect adjacent
> registers.
>
> My quick check in idxd spec doesn't reveal an real impact in perf
> critical path. Most registers are configuration/control registers
> accessed at driver init time and a few interrupt registers related
> to errors or administrative purpose.

Right, it looks like you'll need to trap writes to the MSI-X
Permissions Table via a sparse mmap capability to avoid assumptions
about whether it lives on the same page as the MSI-X vector table or
PBA. Ideally the hardware folks have considered this to avoid any
conflict with latency-sensitive registers.

The variant driver would use this for collecting the metadata
relative to the IMS interrupt, but this is all tangential to whether we
preemptively slice up vfio-pci-core's SET_IRQS ioctl or the iDXD driver
implements its own.

And just to be clear, I don't expect the iDXD variant driver to go to
extraordinary lengths to duplicate the core ioctl; we can certainly
refactor and export things where it makes sense. But I think it likely
makes more sense for the variant driver to implement the shell of the
ioctl rather than trying to multiplex the entire core ioctl with an ops
structure that's so intimately tied to the core implementation and
focused only on the MSI-X code paths. Thanks,

Alex
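
To illustrate the sparse mmap point: a rough, userspace-testable sketch of
how a BAR might be split into mmap-able areas around a trapped register
range. It assumes 4 KiB pages. struct mmap_area and sparse_areas() are
hypothetical names for illustration, not the vfio uAPI types (the real
capability also describes areas as offset/size pairs), and no offsets
here come from the IDXD spec.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL /* assumption: 4 KiB host pages */

/* Hypothetical stand-in for one mmap-able area of a BAR. */
struct mmap_area {
	uint64_t offset;
	uint64_t size;
};

/*
 * Given a BAR of bar_size bytes and a register range
 * [trap_off, trap_off + trap_len) that must be trapped, fill out[]
 * with the page-aligned areas that may still be mmapped directly.
 * Every page touched by the trapped range is excluded, so adjacent
 * registers on those pages are trapped as well.
 * Returns the number of areas written (0, 1 or 2).
 */
static size_t sparse_areas(uint64_t bar_size, uint64_t trap_off,
			   uint64_t trap_len, struct mmap_area out[2])
{
	uint64_t mask = SKETCH_PAGE_SIZE - 1;
	uint64_t start, end;
	size_t n = 0;

	assert(trap_len > 0 && trap_off + trap_len <= bar_size);

	start = trap_off & ~mask;                     /* round down */
	end = (trap_off + trap_len + mask) & ~mask;   /* round up */
	if (end > bar_size)
		end = bar_size;

	if (start > 0)
		out[n++] = (struct mmap_area){ 0, start };
	if (end < bar_size)
		out[n++] = (struct mmap_area){ end, bar_size - end };

	return n;
}
```

If the trapped range sits in the first page of the BAR, as described for
the permission table above, this yields a single area starting at the
second page, and everything else in that first page goes through the
read/write handler.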