Date:   Tue, 14 Dec 2021 11:40:57 +0000
Message-ID: <875yrrjtx2.wl-maz@kernel.org>
From:   Marc Zyngier <maz@kernel.org>
To:     Sunil Muthuswamy <sunilmut@microsoft.com>
Cc:     KY Srinivasan <kys@microsoft.com>,
        Haiyang Zhang <haiyangz@microsoft.com>,
        Stephen Hemminger <sthemmin@microsoft.com>,
        "wei.liu@kernel.org" <wei.liu@kernel.org>,
        Dexuan Cui <decui@microsoft.com>,
        "tglx@linutronix.de" <tglx@linutronix.de>,
        "mingo@redhat.com" <mingo@redhat.com>,
        "bp@alien8.de" <bp@alien8.de>, "hpa@zytor.com" <hpa@zytor.com>,
        "lorenzo.pieralisi@arm.com" <lorenzo.pieralisi@arm.com>,
        "robh@kernel.org" <robh@kernel.org>, "kw@linux.com" <kw@linux.com>,
        "bhelgaas@google.com" <bhelgaas@google.com>,
        "arnd@arndb.de" <arnd@arndb.de>, "x86@kernel.org" <x86@kernel.org>,
        "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
        "linux-hyperv@vger.kernel.org" <linux-hyperv@vger.kernel.org>,
        "linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
        "linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>
Subject: Re: [EXTERNAL] Re: [PATCH v6 2/2] arm64: PCI: hv: Add support for Hyper-V vPCI
In-Reply-To: <BN8PR21MB114040F48FB7F3988BA95032C0759@BN8PR21MB1140.namprd21.prod.outlook.com>
References: <1637225490-2213-1-git-send-email-sunilmut@linux.microsoft.com>
        <1637225490-2213-3-git-send-email-sunilmut@linux.microsoft.com>
        <875yso6tbi.wl-maz@kernel.org>
        <BN8PR21MB114040F48FB7F3988BA95032C0759@BN8PR21MB1140.namprd21.prod.outlook.com>
User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue)
 FLIM-LB/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL-LB/10.8 EasyPG/1.0.0 Emacs/27.1
 (x86_64-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO)
MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue")
Content-Type: text/plain; charset=US-ASCII
Precedence: bulk

On Tue, 14 Dec 2021 00:46:59 +0000,
Sunil Muthuswamy <sunilmut@microsoft.com> wrote:
> 
> On Friday, November 19, 2021 7:47 AM,
> Marc Zyngier <maz@kernel.org> wrote:
> 
> [nip..]
> 
> > > +static int hv_pci_vec_alloc_device_irq(struct irq_domain *domain,
> > > +				       unsigned int nr_irqs,
> > > +				       irq_hw_number_t *hwirq)
> > > +{
> > > +	struct hv_pci_chip_data *chip_data = domain->host_data;
> > > +	unsigned int index;
> > > +
> > > +	/* Find and allocate region from the SPI bitmap */
> > > +	mutex_lock(&chip_data->map_lock);
> > > +	index = bitmap_find_free_region(chip_data->spi_map,
> > > +					HV_PCI_MSI_SPI_NR,
> > > +					get_count_order(nr_irqs));
> > > +	mutex_unlock(&chip_data->map_lock);
> > > +	if (index < 0)
> > > +		return -ENOSPC;
> > > +
> > > +	*hwirq = index + HV_PCI_MSI_SPI_START;
> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +static int hv_pci_vec_irq_gic_domain_alloc(struct irq_domain *domain,
> > > +					   unsigned int virq,
> > > +					   irq_hw_number_t hwirq)
> > > +{
> > > +	struct irq_fwspec fwspec;
> > > +
> > > +	fwspec.fwnode = domain->parent->fwnode;
> > > +	fwspec.param_count = 2;
> > > +	fwspec.param[0] = hwirq;
> > > +	fwspec.param[1] = IRQ_TYPE_EDGE_RISING;
> > > +
> > > +	return irq_domain_alloc_irqs_parent(domain, virq, 1, &fwspec);
> > 
> > I think you are missing the actual edge configuration here. Since the
> > interrupt specifier doesn't come from either DT or ACPI, nobody will
> > set the trigger type, and you have to do it yourself here. At the
> > moment, you will get whatever is in the GIC configuration.
> > 
> 
> I see, thanks. So, just a call of irq_set_irq_type(IRQ_TYPE_EDGE_RISING)?

You are already deep in the irq stack, and calling a high level
function here is a pretty bad idea. You'll need something like:

	struct irq_data *d;
	d = irq_domain_get_irq_data(domain->parent, virq);
	d->chip->irq_set_type(d, IRQ_TYPE_EDGE_RISING);

on the return from the parent allocation.

> > > +}
> > > +
> > > +static int hv_pci_vec_irq_domain_alloc(struct irq_domain *domain,
> > > +				       unsigned int virq, unsigned int nr_irqs,
> > > +				       void *args)
> > > +{
> > > +	irq_hw_number_t hwirq;
> > > +	unsigned int i;
> > > +	int ret;
> > > +
> > > +	ret = hv_pci_vec_alloc_device_irq(domain, nr_irqs, &hwirq);
> > > +	if (ret)
> > > +		return ret;
> > > +
> > > +	for (i = 0; i < nr_irqs; i++) {
> > > +		ret = hv_pci_vec_irq_gic_domain_alloc(domain, virq + i,
> > > +						      hwirq + i);
> > > +		if (ret)
> > > +			goto free_irq;
> > > +
> > > +		ret = irq_domain_set_hwirq_and_chip(domain, virq + i,
> > > +						    hwirq + i,
> > > +						    &hv_arm64_msi_irq_chip,
> > > +						    domain->host_data);
> > > +		if (ret)
> > > +			goto free_irq;
> > > +
> > > +		pr_debug("pID:%d vID:%u\n", (int)(hwirq + i), virq + i);
> > > +	}
> > > +
> > > +	return 0;
> > > +
> > > +free_irq:
> > > +	hv_pci_vec_irq_domain_free(domain, virq, nr_irqs);
> > > +
> > > +	return ret;
> > 
> > How about the interrupts that have already been allocated?
> 
> Not sure I am fully following. If you are referring to the failure
> path and the interrupts that were allocated, then I am calling '
> hv_pci_vec_irq_domain_free' which should free the interrupts from
> the bitmap and the parent irq domain.  Can you please clarify?

I see several problems on the failure path:

- You are freeing more than you actually configured. Not necessary a
  big deal, but still (it is a common issue, and the core deals with
  it)

- hv_pci_vec_irq_domain_free() calls irq_domain_reset_irq_data() on a
  single pointer. Why? Either you wipe them all, or you don't.

>  
> > 
> > > +}
> > > +
> > > +/*
> > > + * Pick the first online cpu as the irq affinity that can be temporarily used
> > > + * for composing MSI from the hypervisor. GIC will eventually set the right
> > > + * affinity for the irq and the 'unmask' will retarget the interrupt to that
> > > + * cpu.
> > > + */
> > > +static int hv_pci_vec_irq_domain_activate(struct irq_domain *domain,
> > > +					  struct irq_data *irqd, bool reserve)
> > > +{
> > > +	int cpu = cpumask_first(cpu_online_mask);
> > > +
> > > +	irq_data_update_effective_affinity(irqd, cpumask_of(cpu));
> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +static const struct irq_domain_ops hv_pci_domain_ops = {
> > > +	.alloc	= hv_pci_vec_irq_domain_alloc,
> > > +	.free	= hv_pci_vec_irq_domain_free,
> > > +	.activate = hv_pci_vec_irq_domain_activate,
> > > +};
> > > +
> > > +static int hv_pci_irqchip_init(void)
> > > +{
> > > +	static struct hv_pci_chip_data *chip_data;
> > > +	struct fwnode_handle *fn = NULL;
> > > +	int ret = -ENOMEM;
> > > +
> > > +	chip_data = kzalloc(sizeof(*chip_data), GFP_KERNEL);
> > > +	if (!chip_data)
> > > +		return ret;
> > > +
> > > +	mutex_init(&chip_data->map_lock);
> > > +	fn = irq_domain_alloc_named_fwnode("Hyper-V ARM64 vPCI");
> > 
> > This will appear in debugfs. I'd rather you keep it short, sweet and
> > without spaces. "hv_vpci_arm64" seems better to me.
> 
> Sure, will fix in next version.
> 
> > >
> > > @@ -1619,6 +1820,7 @@ static struct irq_chip hv_msi_irq_chip = {
> > >  	.irq_compose_msi_msg	= hv_compose_msi_msg,
> > >  	.irq_set_affinity	= irq_chip_set_affinity_parent,
> > >  	.irq_ack		= irq_chip_ack_parent,
> > > +	.irq_eoi		= irq_chip_eoi_parent,
> > >  	.irq_mask		= hv_irq_mask,
> > >  	.irq_unmask		= hv_irq_unmask,
> > 
> > You probably want to avoid unconditionally setting callbacks that may
> > have side effects on another architecture (ack on arm64, eoi on x86).
> 
> Thanks. Will fix in next version.
> 
> Is there some other feedback that would like to see get addressed in the
> current patch? Trying to close down on all remaining feedback items here.

Not at the moment, as I have paged this out a long time ago.
Addressing feedback more often than once a month would definitely
help. I usually complain about patches being sent too often, but
you're squarely in the opposite camp.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.