From: Thomas Gleixner
To: "Tian, Kevin", Alex Williamson
Cc: Jason Gunthorpe, "Dey, Megha", "Raj, Ashok", "Pan, Jacob jun",
    "Jiang, Dave", "Liu, Yi L", "Lu, Baolu", "Williams, Dan J",
    "Luck, Tony", "Kumar, Sanjay K", LKML, KVM, Kirti Wankhede,
    Peter Zijlstra, Marc Zyngier, Bjorn Helgaas
Subject: RE: Virtualizing MSI-X on IMS via VFIO
References: <20210622131217.76b28f6f.alex.williamson@redhat.com>
    <87o8bxcuxv.ffs@nanos.tec.linutronix.de>
Date: Wed, 23 Jun 2021 18:31:34 +0200
Message-ID: <87bl7wczkp.ffs@nanos.tec.linutronix.de>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jun 23 2021 at 06:12, Kevin Tian wrote:
>> From: Thomas Gleixner
>> So the only downside today of allocating more MSI-X vectors than
>> necessary is memory consumption for the irq descriptors.
>
> Curious about irte entry when IRQ remapping is enabled. Is it also
> allocated at request_irq()?

Good question. No, it has to be allocated right away. We stick the
shutdown vector into the IRTE and then request_irq() will update it with
the real one.

> So the correct flow is like below:
>
>     guest::enable_msix()
>         trapped_by_host()
>             pci_alloc_irq_vectors(); // for all possible vMSI-X entries
>                 pci_enable_msix();
>
>     guest::unmask()
>         trapped_by_host()
>             request_irqs();
>
> the first trap calls a new VFIO ioctl e.g. VFIO_DEVICE_ALLOC_IRQS.
>
> the 2nd trap can reuse existing VFIO_DEVICE_SET_IRQS which just
> does request_irq() if specified irqs have been allocated.
>
> Then map ims to this flow:
>
>     guest::enable_msix()
>         trapped_by_host()
>             msi_domain_alloc_irqs(); // for all possible vMSI-X entries
>             for_all_allocated_irqs(i)
>                 pci_update_msi_desc_id(i, default_pasid); // a new helper func
>
>     guest::unmask(entry#0)
>         trapped_by_host()
>             request_irqs();
>                 ims_array_irq_startup(); // write msi_desc.id (default_pasid) to ims entry
>
>     guest::set_msix_perm(entry#1, guest_sva_pasid)
>         trapped_by_host()
>             pci_update_msi_desc_id(1, host_sva_pasid);
>
>     guest::unmask(entry#1)
>         trapped_by_host()
>             request_irqs();
>                 ims_array_irq_startup(); // write msi_desc.id (host_sva_pasid) to ims entry

That's one way to do it, but it still has the same problem: the
request_irq() in the guest succeeds even if the host side fails.

As this is really new stuff, there is no good reason to force it into
the existing VFIO/MSIX stuff with all its known downsides and
limitations.

The point is that IMS can just add another interrupt to a device on the
fly without doing any of the PCI/MSIX nasties. So why not take advantage
of that?

I can see the point of using PCI to expose the device to the guest
because it's trivial to enumerate, but contrary to VF devices there is
no legacy, and the mechanism for setting up the device interrupts can be
completely different from PCI/MSIX.

Exposing some trappable "IMS" storage in a separate PCI BAR won't cut it
because this still has the same problem: the allocation or request_irq()
on the host can fail without any feedback to the guest.

So IMO creating a proper paravirt interface is the right approach. It
avoids _all_ of the trouble and will be necessary anyway once you want
to support devices which store the message/pasid in system memory and
not in on-device memory.

Thanks,

        tglx
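
The feedback problem described above can be illustrated with a toy userspace model. This is only a sketch under stated assumptions, not kernel code and not a proposed interface: every name in it (host_state, host_request_irq, trapped_unmask_write, guest_request_irq_emulated, guest_request_irq_paravirt) is hypothetical. It contrasts a trapped MMIO unmask, which has no way to report a host-side failure, with a call-style paravirt path that hands the host's error code back to the guest.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* All names are hypothetical; this models the control flow only. */
#define NR_ENTRIES 4

struct host_state {
    int  free_vectors;            /* host resources left for interrupt setup */
    bool host_irq_ok[NR_ENTRIES]; /* did host-side setup succeed per entry? */
};

/* Stand-in for the host-side allocation / request_irq(); it may fail. */
static int host_request_irq(struct host_state *h, unsigned int entry)
{
    assert(entry < NR_ENTRIES);
    if (h->free_vectors == 0)
        return -ENOSPC;
    h->free_vectors--;
    h->host_irq_ok[entry] = true;
    return 0;
}

/*
 * Trap-and-emulate path: the guest unmasks an entry with a plain MMIO
 * store. A store has no return value, so a host failure is dropped here.
 */
static void trapped_unmask_write(struct host_state *h, unsigned int entry)
{
    (void)host_request_irq(h, entry);
}

/* Guest-side request on emulated MSI-X: reports success unconditionally. */
static int guest_request_irq_emulated(struct host_state *h, unsigned int entry)
{
    trapped_unmask_write(h, entry);
    return 0;
}

/* Paravirt path: modeled as a call whose result is the host's result. */
static int guest_request_irq_paravirt(struct host_state *h, unsigned int entry)
{
    return host_request_irq(h, entry);
}
```

With a host that has a single vector left, the emulated path returns 0 for a second entry even though the host-side setup for that entry never happened, while the paravirt path returns the host's -ENOSPC to the caller.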