2020-05-26 12:58:15

by zhangfei

[permalink] [raw]
Subject: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU

Some platform devices appear as PCI but are actually on the AMBA bus,
and they need fixup in drivers/pci/quirks.c handling iommu_fwnode.
Here introducing PCI_FIXUP_IOMMU, which is called after iommu_fwnode
is allocated, instead of reusing PCI_FIXUP_FINAL since it will slow
down iommu probing as all devices in fixup final list will be
reprocessed, suggested by Joerg, [1]

For example:
Hisilicon platform device need fixup in
drivers/pci/quirks.c handling fwspec->can_stall, which is introduced in [2]

+static void quirk_huawei_pcie_sva(struct pci_dev *pdev)
+{
+ struct iommu_fwspec *fwspec;
+
+ pdev->eetlp_prefix_path = 1;
+ fwspec = dev_iommu_fwspec_get(&pdev->dev);
+ if (fwspec)
+ fwspec->can_stall = 1;
+}
+
+DECLARE_PCI_FIXUP_IOMMU(PCI_VENDOR_ID_HUAWEI, 0xa250, quirk_huawei_pcie_sva);
+DECLARE_PCI_iFIXUP_IOMMU(PCI_VENDOR_ID_HUAWEI, 0xa251, quirk_huawei_pcie_sva);

[1] https://www.spinics.net/lists/iommu/msg44591.html
[2] https://www.spinics.net/lists/linux-pci/msg94559.html

Zhangfei Gao (2):
PCI: Introduce PCI_FIXUP_IOMMU
iommu: calling pci_fixup_iommu in iommu_fwspec_init

drivers/iommu/iommu.c | 4 ++++
drivers/pci/quirks.c | 7 +++++++
include/asm-generic/vmlinux.lds.h | 3 +++
include/linux/pci.h | 8 ++++++++
4 files changed, 22 insertions(+)

--
2.7.4


2020-05-26 12:59:36

by zhangfei

[permalink] [raw]
Subject: [PATCH 1/2] PCI: Introduce PCI_FIXUP_IOMMU

Some platform devices appear as PCI but are actually on the AMBA bus,
and they need fixup in drivers/pci/quirks.c handling iommu_fwnode.
Here introducing PCI_FIXUP_IOMMU, which is called after iommu_fwnode
is allocated, instead of reusing PCI_FIXUP_FINAL since it will slow
down iommu probing as all devices in fixup final list will be
reprocessed.

Suggested-by: Joerg Roedel <[email protected]>
Signed-off-by: Zhangfei Gao <[email protected]>
---
drivers/pci/quirks.c | 7 +++++++
include/asm-generic/vmlinux.lds.h | 3 +++
include/linux/pci.h | 8 ++++++++
3 files changed, 18 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index ca9ed57..b037034 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -83,6 +83,8 @@ extern struct pci_fixup __start_pci_fixups_header[];
extern struct pci_fixup __end_pci_fixups_header[];
extern struct pci_fixup __start_pci_fixups_final[];
extern struct pci_fixup __end_pci_fixups_final[];
+extern struct pci_fixup __start_pci_fixups_iommu[];
+extern struct pci_fixup __end_pci_fixups_iommu[];
extern struct pci_fixup __start_pci_fixups_enable[];
extern struct pci_fixup __end_pci_fixups_enable[];
extern struct pci_fixup __start_pci_fixups_resume[];
@@ -118,6 +120,11 @@ void pci_fixup_device(enum pci_fixup_pass pass, struct pci_dev *dev)
end = __end_pci_fixups_final;
break;

+ case pci_fixup_iommu:
+ start = __start_pci_fixups_iommu;
+ end = __end_pci_fixups_iommu;
+ break;
+
case pci_fixup_enable:
start = __start_pci_fixups_enable;
end = __end_pci_fixups_enable;
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 71e387a..3da32d8 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -411,6 +411,9 @@
__start_pci_fixups_final = .; \
KEEP(*(.pci_fixup_final)) \
__end_pci_fixups_final = .; \
+ __start_pci_fixups_iommu = .; \
+ KEEP(*(.pci_fixup_iommu)) \
+ __end_pci_fixups_iommu = .; \
__start_pci_fixups_enable = .; \
KEEP(*(.pci_fixup_enable)) \
__end_pci_fixups_enable = .; \
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 83ce1cd..0d5fbf8 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1892,6 +1892,7 @@ enum pci_fixup_pass {
pci_fixup_early, /* Before probing BARs */
pci_fixup_header, /* After reading configuration header */
pci_fixup_final, /* Final phase of device fixups */
+ pci_fixup_iommu, /* After iommu_fwspec_init */
pci_fixup_enable, /* pci_enable_device() time */
pci_fixup_resume, /* pci_device_resume() */
pci_fixup_suspend, /* pci_device_suspend() */
@@ -1934,6 +1935,10 @@ enum pci_fixup_pass {
class_shift, hook) \
DECLARE_PCI_FIXUP_SECTION(.pci_fixup_final, \
hook, vendor, device, class, class_shift, hook)
+#define DECLARE_PCI_FIXUP_CLASS_IOMMU(vendor, device, class, \
+ class_shift, hook) \
+ DECLARE_PCI_FIXUP_SECTION(.pci_fixup_iommu, \
+ hook, vendor, device, class, class_shift, hook)
#define DECLARE_PCI_FIXUP_CLASS_ENABLE(vendor, device, class, \
class_shift, hook) \
DECLARE_PCI_FIXUP_SECTION(.pci_fixup_enable, \
@@ -1964,6 +1969,9 @@ enum pci_fixup_pass {
#define DECLARE_PCI_FIXUP_FINAL(vendor, device, hook) \
DECLARE_PCI_FIXUP_SECTION(.pci_fixup_final, \
hook, vendor, device, PCI_ANY_ID, 0, hook)
+#define DECLARE_PCI_FIXUP_IOMMU(vendor, device, hook) \
+ DECLARE_PCI_FIXUP_SECTION(.pci_fixup_iommu, \
+ hook, vendor, device, PCI_ANY_ID, 0, hook)
#define DECLARE_PCI_FIXUP_ENABLE(vendor, device, hook) \
DECLARE_PCI_FIXUP_SECTION(.pci_fixup_enable, \
hook, vendor, device, PCI_ANY_ID, 0, hook)
--
2.7.4

2020-05-26 13:00:00

by zhangfei

[permalink] [raw]
Subject: [PATCH 2/2] iommu: calling pci_fixup_iommu in iommu_fwspec_init

Calling pci_fixup_iommu in iommu_fwspec_init, which alloc
iommu_fwnode. Some platform devices appear as PCI but are
actually on the AMBA bus, and they need fixup in
drivers/pci/quirks.c handling iommu_fwnode.
So calling pci_fixup_iommu after iommu_fwnode is allocated.

Signed-off-by: Zhangfei Gao <[email protected]>
---
drivers/iommu/iommu.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 7b37542..fb84c42 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2418,6 +2418,10 @@ int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode,
fwspec->iommu_fwnode = iommu_fwnode;
fwspec->ops = ops;
dev_iommu_fwspec_set(dev, fwspec);
+
+ if (dev_is_pci(dev))
+ pci_fixup_device(pci_fixup_iommu, to_pci_dev(dev));
+
return 0;
}
EXPORT_SYMBOL_GPL(iommu_fwspec_init);
--
2.7.4

2020-05-26 14:49:50

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 1/2] PCI: Introduce PCI_FIXUP_IOMMU

On Tue, May 26, 2020 at 07:49:08PM +0800, Zhangfei Gao wrote:
> Some platform devices appear as PCI but are actually on the AMBA bus,
> and they need fixup in drivers/pci/quirks.c handling iommu_fwnode.
> Here introducing PCI_FIXUP_IOMMU, which is called after iommu_fwnode
> is allocated, instead of reusing PCI_FIXUP_FINAL since it will slow
> down iommu probing as all devices in fixup final list will be
> reprocessed.

Who is going to use this? I don't see a single user in the series.

2020-05-26 15:11:27

by zhangfei

[permalink] [raw]
Subject: Re: [PATCH 1/2] PCI: Introduce PCI_FIXUP_IOMMU

Hi, Christoph

On 2020/5/26 下午10:46, Christoph Hellwig wrote:
> On Tue, May 26, 2020 at 07:49:08PM +0800, Zhangfei Gao wrote:
>> Some platform devices appear as PCI but are actually on the AMBA bus,
>> and they need fixup in drivers/pci/quirks.c handling iommu_fwnode.
>> Here introducing PCI_FIXUP_IOMMU, which is called after iommu_fwnode
>> is allocated, instead of reusing PCI_FIXUP_FINAL since it will slow
>> down iommu probing as all devices in fixup final list will be
>> reprocessed.
> Who is going to use this? I don't see a single user in the series.
We will add iommu fixup in drivers/pci/quirks.c, handling

fwspec->can_stall, which is introduced in

https://www.spinics.net/lists/linux-pci/msg94559.html

Unfortunately, the patch does not catch v5.8, so we have to wait.
And we want to check whether this is a right method to solve this issue.

Thanks

2020-05-27 09:01:41

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU

On Tue, May 26, 2020 at 07:49:07PM +0800, Zhangfei Gao wrote:
> Some platform devices appear as PCI but are actually on the AMBA bus,

Why would these devices not just show up on the AMBA bus and use all of
that logic instead of being a PCI device and having to go through odd
fixes like this?

thanks,

greg k-h

2020-05-27 09:03:37

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 1/2] PCI: Introduce PCI_FIXUP_IOMMU

On Tue, May 26, 2020 at 11:09:57PM +0800, Zhangfei Gao wrote:
> Hi, Christoph
>
> On 2020/5/26 下午10:46, Christoph Hellwig wrote:
> > On Tue, May 26, 2020 at 07:49:08PM +0800, Zhangfei Gao wrote:
> > > Some platform devices appear as PCI but are actually on the AMBA bus,
> > > and they need fixup in drivers/pci/quirks.c handling iommu_fwnode.
> > > Here introducing PCI_FIXUP_IOMMU, which is called after iommu_fwnode
> > > is allocated, instead of reusing PCI_FIXUP_FINAL since it will slow
> > > down iommu probing as all devices in fixup final list will be
> > > reprocessed.
> > Who is going to use this? I don't see a single user in the series.
> We will add iommu fixup in drivers/pci/quirks.c, handling
>
> fwspec->can_stall, which is introduced in
>
> https://www.spinics.net/lists/linux-pci/msg94559.html
>
> Unfortunately, the patch does not catch v5.8, so we have to wait.
> And we want to check whether this is a right method to solve this issue.

We can't take new apis without a real user, so please submit them all at
once.

thanks,

greg k-h

2020-05-27 09:04:01

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 2/2] iommu: calling pci_fixup_iommu in iommu_fwspec_init

On Tue, May 26, 2020 at 07:49:09PM +0800, Zhangfei Gao wrote:
> Calling pci_fixup_iommu in iommu_fwspec_init, which alloc
> iommu_fwnode. Some platform devices appear as PCI but are
> actually on the AMBA bus, and they need fixup in
> drivers/pci/quirks.c handling iommu_fwnode.
> So calling pci_fixup_iommu after iommu_fwnode is allocated.
>
> Signed-off-by: Zhangfei Gao <[email protected]>
> ---
> drivers/iommu/iommu.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 7b37542..fb84c42 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -2418,6 +2418,10 @@ int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode,
> fwspec->iommu_fwnode = iommu_fwnode;
> fwspec->ops = ops;
> dev_iommu_fwspec_set(dev, fwspec);
> +
> + if (dev_is_pci(dev))
> + pci_fixup_device(pci_fixup_iommu, to_pci_dev(dev));

Why can't the caller do this as it "knows" it is a PCI device at that
point in time, right?

thanks,

greg k-h

2020-05-27 11:32:10

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU

On Wed, May 27, 2020 at 11:00 AM Greg Kroah-Hartman
<[email protected]> wrote:
>
> On Tue, May 26, 2020 at 07:49:07PM +0800, Zhangfei Gao wrote:
> > Some platform devices appear as PCI but are actually on the AMBA bus,
>
> Why would these devices not just show up on the AMBA bus and use all of
> that logic instead of being a PCI device and having to go through odd
> fixes like this?

There is a general move to having hardware be discoverable even with
ARM processors. Having on-chip devices be discoverable using PCI config
space is how x86 SoCs usually do it, and that is generally a good thing
as it means we don't need to describe them in DT

I guess as the hardware designers are still learning about it, this is not
always done correctly. In general, we can also describe PCI devices on
DT and do fixups during the probing there, but I suspect that won't work
as easily using ACPI probing, so the fixup is keyed off the hardware ID,
again as is common for x86 on-chip devices.

Arnd

2020-05-27 13:53:10

by zhangfei

[permalink] [raw]
Subject: Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU



On 2020/5/27 下午5:53, Arnd Bergmann wrote:
> On Wed, May 27, 2020 at 11:00 AM Greg Kroah-Hartman
> <[email protected]> wrote:
>> On Tue, May 26, 2020 at 07:49:07PM +0800, Zhangfei Gao wrote:
>>> Some platform devices appear as PCI but are actually on the AMBA bus,
>> Why would these devices not just show up on the AMBA bus and use all of
>> that logic instead of being a PCI device and having to go through odd
>> fixes like this?
> There is a general move to having hardware be discoverable even with
> ARM processors. Having on-chip devices be discoverable using PCI config
> space is how x86 SoCs usually do it, and that is generally a good thing
> as it means we don't need to describe them in DT
>
> I guess as the hardware designers are still learning about it, this is not
> always done correctly. In general, we can also describe PCI devices on
> DT and do fixups during the probing there, but I suspect that won't work
> as easily using ACPI probing, so the fixup is keyed off the hardware ID,
> again as is common for x86 on-chip devices.
>
>
Yes, thanks Arnd :)

In order to use pasid, io page fault has to be supported,
either by PCI PRI feature (from pci device) or stall mode from smmu
(platform device).
Here is letting system know the platform device can support smmu stall
mode, as a result support pasid.
While stall is not a pci capability, so we use a fixup here.

Thanks

2020-05-27 19:22:06

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU

On Tue, May 26, 2020 at 07:49:07PM +0800, Zhangfei Gao wrote:
> Some platform devices appear as PCI but are actually on the AMBA bus,
> and they need fixup in drivers/pci/quirks.c handling iommu_fwnode.
> Here introducing PCI_FIXUP_IOMMU, which is called after iommu_fwnode
> is allocated, instead of reusing PCI_FIXUP_FINAL since it will slow
> down iommu probing as all devices in fixup final list will be
> reprocessed, suggested by Joerg, [1]

Is this slowdown significant? We already iterate over every device
when applying PCI_FIXUP_FINAL quirks, so if we used the existing
PCI_FIXUP_FINAL, we wouldn't be adding a new loop. We would only be
adding two more iterations to the loop in pci_do_fixups() that tries
to match quirks against the current device. I doubt that would be a
measurable slowdown.

> For example:
> Hisilicon platform device need fixup in
> drivers/pci/quirks.c handling fwspec->can_stall, which is introduced in [2]
>
> +static void quirk_huawei_pcie_sva(struct pci_dev *pdev)
> +{
> + struct iommu_fwspec *fwspec;
> +
> + pdev->eetlp_prefix_path = 1;
> + fwspec = dev_iommu_fwspec_get(&pdev->dev);
> + if (fwspec)
> + fwspec->can_stall = 1;
> +}
> +
> +DECLARE_PCI_FIXUP_IOMMU(PCI_VENDOR_ID_HUAWEI, 0xa250, quirk_huawei_pcie_sva);
> +DECLARE_PCI_iFIXUP_IOMMU(PCI_VENDOR_ID_HUAWEI, 0xa251, quirk_huawei_pcie_sva);
>
> [1] https://www.spinics.net/lists/iommu/msg44591.html
> [2] https://www.spinics.net/lists/linux-pci/msg94559.html

If you reference these in the commit logs, please use lore.kernel.org
links instead of spinics.

> Zhangfei Gao (2):
> PCI: Introduce PCI_FIXUP_IOMMU
> iommu: calling pci_fixup_iommu in iommu_fwspec_init
>
> drivers/iommu/iommu.c | 4 ++++
> drivers/pci/quirks.c | 7 +++++++
> include/asm-generic/vmlinux.lds.h | 3 +++
> include/linux/pci.h | 8 ++++++++
> 4 files changed, 22 insertions(+)
>
> --
> 2.7.4
>

2020-05-28 06:47:49

by zhangfei

[permalink] [raw]
Subject: Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU

Hi, Bjorn

On 2020/5/28 上午2:18, Bjorn Helgaas wrote:
> On Tue, May 26, 2020 at 07:49:07PM +0800, Zhangfei Gao wrote:
>> Some platform devices appear as PCI but are actually on the AMBA bus,
>> and they need fixup in drivers/pci/quirks.c handling iommu_fwnode.
>> Here introducing PCI_FIXUP_IOMMU, which is called after iommu_fwnode
>> is allocated, instead of reusing PCI_FIXUP_FINAL since it will slow
>> down iommu probing as all devices in fixup final list will be
>> reprocessed, suggested by Joerg, [1]
> Is this slowdown significant? We already iterate over every device
> when applying PCI_FIXUP_FINAL quirks, so if we used the existing
> PCI_FIXUP_FINAL, we wouldn't be adding a new loop. We would only be
> adding two more iterations to the loop in pci_do_fixups() that tries
> to match quirks against the current device. I doubt that would be a
> measurable slowdown.
I do not notice the difference when compared fixup_iommu and fixup_final
via get_jiffies_64,
since in our platform no other pci fixup is registered.

Here the plan is adding pci_fixup_device in iommu_fwspec_init,
so if using fixup_final the iteration will be done again here.

>
>> For example:
>> Hisilicon platform device need fixup in
>> drivers/pci/quirks.c handling fwspec->can_stall, which is introduced in [2]
>>
>> +static void quirk_huawei_pcie_sva(struct pci_dev *pdev)
>> +{
>> + struct iommu_fwspec *fwspec;
>> +
>> + pdev->eetlp_prefix_path = 1;
>> + fwspec = dev_iommu_fwspec_get(&pdev->dev);
>> + if (fwspec)
>> + fwspec->can_stall = 1;
>> +}
>> +
>> +DECLARE_PCI_FIXUP_IOMMU(PCI_VENDOR_ID_HUAWEI, 0xa250, quirk_huawei_pcie_sva);
>> +DECLARE_PCI_iFIXUP_IOMMU(PCI_VENDOR_ID_HUAWEI, 0xa251, quirk_huawei_pcie_sva);
>>
>> [1] https://www.spinics.net/lists/iommu/msg44591.html
>> [2] https://www.spinics.net/lists/linux-pci/msg94559.html
> If you reference these in the commit logs, please use lore.kernel.org
> links instead of spinics.
Got it, thanks Bjorn.



2020-05-28 06:54:33

by zhangfei

[permalink] [raw]
Subject: Re: [PATCH 2/2] iommu: calling pci_fixup_iommu in iommu_fwspec_init



On 2020/5/27 下午5:01, Greg Kroah-Hartman wrote:
> On Tue, May 26, 2020 at 07:49:09PM +0800, Zhangfei Gao wrote:
>> Calling pci_fixup_iommu in iommu_fwspec_init, which alloc
>> iommu_fwnode. Some platform devices appear as PCI but are
>> actually on the AMBA bus, and they need fixup in
>> drivers/pci/quirks.c handling iommu_fwnode.
>> So calling pci_fixup_iommu after iommu_fwnode is allocated.
>>
>> Signed-off-by: Zhangfei Gao <[email protected]>
>> ---
>> drivers/iommu/iommu.c | 4 ++++
>> 1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>> index 7b37542..fb84c42 100644
>> --- a/drivers/iommu/iommu.c
>> +++ b/drivers/iommu/iommu.c
>> @@ -2418,6 +2418,10 @@ int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode,
>> fwspec->iommu_fwnode = iommu_fwnode;
>> fwspec->ops = ops;
>> dev_iommu_fwspec_set(dev, fwspec);
>> +
>> + if (dev_is_pci(dev))
>> + pci_fixup_device(pci_fixup_iommu, to_pci_dev(dev));
> Why can't the caller do this as it "knows" it is a PCI device at that
> point in time, right?
Putting fixup here is because
1. iommu_fwspec has been allocated
2. iommu_fwspec_init will be called by of_pci_iommu_init and
iort_pci_iommu_init, covering both acpi and dt

Thanks

2020-05-28 07:34:38

by Joerg Roedel

[permalink] [raw]
Subject: Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU

On Wed, May 27, 2020 at 01:18:42PM -0500, Bjorn Helgaas wrote:
> Is this slowdown significant? We already iterate over every device
> when applying PCI_FIXUP_FINAL quirks, so if we used the existing
> PCI_FIXUP_FINAL, we wouldn't be adding a new loop. We would only be
> adding two more iterations to the loop in pci_do_fixups() that tries
> to match quirks against the current device. I doubt that would be a
> measurable slowdown.

I don't know how significant it is, but I remember people complaining
about adding new PCI quirks because it takes too long for them to run
them all. That was in the discussion about the quirk disabling ATS on
AMD Stoney systems.

So it probably depends on how many PCI devices are in the system whether
it causes any measureable slowdown.

Regards,

Joerg

2020-06-01 17:42:50

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU

On Thu, May 28, 2020 at 09:33:44AM +0200, Joerg Roedel wrote:
> On Wed, May 27, 2020 at 01:18:42PM -0500, Bjorn Helgaas wrote:
> > Is this slowdown significant? We already iterate over every device
> > when applying PCI_FIXUP_FINAL quirks, so if we used the existing
> > PCI_FIXUP_FINAL, we wouldn't be adding a new loop. We would only be
> > adding two more iterations to the loop in pci_do_fixups() that tries
> > to match quirks against the current device. I doubt that would be a
> > measurable slowdown.
>
> I don't know how significant it is, but I remember people complaining
> about adding new PCI quirks because it takes too long for them to run
> them all. That was in the discussion about the quirk disabling ATS on
> AMD Stoney systems.
>
> So it probably depends on how many PCI devices are in the system whether
> it causes any measureable slowdown.

I found this [1] from Paul Menzel, which was a slowdown caused by
quirk_usb_early_handoff(). I think the real problem is individual
quirks that take a long time.

The PCI_FIXUP_IOMMU things we're talking about should be fast, and of
course, they're only run for matching devices anyway. So I'd rather
keep them as PCI_FIXUP_FINAL than add a whole new phase.

Bjorn

[1] https://lore.kernel.org/linux-pci/[email protected]/

2020-06-04 13:35:22

by zhangfei

[permalink] [raw]
Subject: Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU



On 2020/6/2 上午1:41, Bjorn Helgaas wrote:
> On Thu, May 28, 2020 at 09:33:44AM +0200, Joerg Roedel wrote:
>> On Wed, May 27, 2020 at 01:18:42PM -0500, Bjorn Helgaas wrote:
>>> Is this slowdown significant? We already iterate over every device
>>> when applying PCI_FIXUP_FINAL quirks, so if we used the existing
>>> PCI_FIXUP_FINAL, we wouldn't be adding a new loop. We would only be
>>> adding two more iterations to the loop in pci_do_fixups() that tries
>>> to match quirks against the current device. I doubt that would be a
>>> measurable slowdown.
>> I don't know how significant it is, but I remember people complaining
>> about adding new PCI quirks because it takes too long for them to run
>> them all. That was in the discussion about the quirk disabling ATS on
>> AMD Stoney systems.
>>
>> So it probably depends on how many PCI devices are in the system whether
>> it causes any measureable slowdown.
> I found this [1] from Paul Menzel, which was a slowdown caused by
> quirk_usb_early_handoff(). I think the real problem is individual
> quirks that take a long time.
>
> The PCI_FIXUP_IOMMU things we're talking about should be fast, and of
> course, they're only run for matching devices anyway. So I'd rather
> keep them as PCI_FIXUP_FINAL than add a whole new phase.
>
Thanks Bjorn for taking time for this.
If so, it would be much simpler.

+++ b/drivers/iommu/iommu.c
@@ -2418,6 +2418,10 @@ int iommu_fwspec_init(struct device *dev, struct
fwnode_handle *iommu_fwnode,
        fwspec->iommu_fwnode = iommu_fwnode;
        fwspec->ops = ops;
        dev_iommu_fwspec_set(dev, fwspec);
+
+       if (dev_is_pci(dev))
+               pci_fixup_device(pci_fixup_final, to_pci_dev(dev));
+

Then pci_fixup_final will be called twice, the first in pci_bus_add_device.
Here in iommu_fwspec_init is the second time, specifically for iommu_fwspec.
Will send this when 5.8-rc1 is open.

Thanks

2020-06-05 23:19:50

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU

On Thu, Jun 04, 2020 at 09:33:07PM +0800, Zhangfei Gao wrote:
> On 2020/6/2 上午1:41, Bjorn Helgaas wrote:
> > On Thu, May 28, 2020 at 09:33:44AM +0200, Joerg Roedel wrote:
> > > On Wed, May 27, 2020 at 01:18:42PM -0500, Bjorn Helgaas wrote:
> > > > Is this slowdown significant? We already iterate over every device
> > > > when applying PCI_FIXUP_FINAL quirks, so if we used the existing
> > > > PCI_FIXUP_FINAL, we wouldn't be adding a new loop. We would only be
> > > > adding two more iterations to the loop in pci_do_fixups() that tries
> > > > to match quirks against the current device. I doubt that would be a
> > > > measurable slowdown.
> > > I don't know how significant it is, but I remember people complaining
> > > about adding new PCI quirks because it takes too long for them to run
> > > them all. That was in the discussion about the quirk disabling ATS on
> > > AMD Stoney systems.
> > >
> > > So it probably depends on how many PCI devices are in the system whether
> > > it causes any measureable slowdown.
> > I found this [1] from Paul Menzel, which was a slowdown caused by
> > quirk_usb_early_handoff(). I think the real problem is individual
> > quirks that take a long time.
> >
> > The PCI_FIXUP_IOMMU things we're talking about should be fast, and of
> > course, they're only run for matching devices anyway. So I'd rather
> > keep them as PCI_FIXUP_FINAL than add a whole new phase.
> >
> Thanks Bjorn for taking time for this.
> If so, it would be much simpler.
>
> +++ b/drivers/iommu/iommu.c
> @@ -2418,6 +2418,10 @@ int iommu_fwspec_init(struct device *dev, struct
> fwnode_handle *iommu_fwnode,
>         fwspec->iommu_fwnode = iommu_fwnode;
>         fwspec->ops = ops;
>         dev_iommu_fwspec_set(dev, fwspec);
> +
> +       if (dev_is_pci(dev))
> +               pci_fixup_device(pci_fixup_final, to_pci_dev(dev));
> +
>
> Then pci_fixup_final will be called twice, the first in pci_bus_add_device.
> Here in iommu_fwspec_init is the second time, specifically for iommu_fwspec.
> Will send this when 5.8-rc1 is open.

Wait, this whole fixup approach seems wrong to me. No matter how you
do the fixup, it's still a fixup, which means it requires ongoing
maintenance. Surely we don't want to have to add the Vendor/Device ID
for every new AMBA device that comes along, do we?

Bjorn

2020-06-08 02:55:49

by zhangfei

[permalink] [raw]
Subject: Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU

Hi, Bjorn

On 2020/6/6 上午7:19, Bjorn Helgaas wrote:
> On Thu, Jun 04, 2020 at 09:33:07PM +0800, Zhangfei Gao wrote:
>> On 2020/6/2 上午1:41, Bjorn Helgaas wrote:
>>> On Thu, May 28, 2020 at 09:33:44AM +0200, Joerg Roedel wrote:
>>>> On Wed, May 27, 2020 at 01:18:42PM -0500, Bjorn Helgaas wrote:
>>>>> Is this slowdown significant? We already iterate over every device
>>>>> when applying PCI_FIXUP_FINAL quirks, so if we used the existing
>>>>> PCI_FIXUP_FINAL, we wouldn't be adding a new loop. We would only be
>>>>> adding two more iterations to the loop in pci_do_fixups() that tries
>>>>> to match quirks against the current device. I doubt that would be a
>>>>> measurable slowdown.
>>>> I don't know how significant it is, but I remember people complaining
>>>> about adding new PCI quirks because it takes too long for them to run
>>>> them all. That was in the discussion about the quirk disabling ATS on
>>>> AMD Stoney systems.
>>>>
>>>> So it probably depends on how many PCI devices are in the system whether
>>>> it causes any measureable slowdown.
>>> I found this [1] from Paul Menzel, which was a slowdown caused by
>>> quirk_usb_early_handoff(). I think the real problem is individual
>>> quirks that take a long time.
>>>
>>> The PCI_FIXUP_IOMMU things we're talking about should be fast, and of
>>> course, they're only run for matching devices anyway. So I'd rather
>>> keep them as PCI_FIXUP_FINAL than add a whole new phase.
>>>
>> Thanks Bjorn for taking time for this.
>> If so, it would be much simpler.
>>
>> +++ b/drivers/iommu/iommu.c
>> @@ -2418,6 +2418,10 @@ int iommu_fwspec_init(struct device *dev, struct
>> fwnode_handle *iommu_fwnode,
>>         fwspec->iommu_fwnode = iommu_fwnode;
>>         fwspec->ops = ops;
>>         dev_iommu_fwspec_set(dev, fwspec);
>> +
>> +       if (dev_is_pci(dev))
>> +               pci_fixup_device(pci_fixup_final, to_pci_dev(dev));
>> +
>>
>> Then pci_fixup_final will be called twice, the first in pci_bus_add_device.
>> Here in iommu_fwspec_init is the second time, specifically for iommu_fwspec.
>> Will send this when 5.8-rc1 is open.
> Wait, this whole fixup approach seems wrong to me. No matter how you
> do the fixup, it's still a fixup, which means it requires ongoing
> maintenance. Surely we don't want to have to add the Vendor/Device ID
> for every new AMBA device that comes along, do we?
>
>
Here the fake pci device has standard PCI cfg space, but physical
implementation is base on AMBA
They can provide pasid feature.
However,
1, does not support tlp since they are not real pci devices.
2. does not support pri, instead support stall (provided by smmu)
And stall is not a pci feature, so it is not described in struct
pci_dev, but in struct iommu_fwspec.
So we use this fixup to tell pci system that the devices can support
stall, and hereby support pasid.

Thanks

2020-06-08 16:43:17

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU

On Mon, Jun 08, 2020 at 10:54:15AM +0800, Zhangfei Gao wrote:
> On 2020/6/6 上午7:19, Bjorn Helgaas wrote:
> > On Thu, Jun 04, 2020 at 09:33:07PM +0800, Zhangfei Gao wrote:
> > > On 2020/6/2 上午1:41, Bjorn Helgaas wrote:
> > > > On Thu, May 28, 2020 at 09:33:44AM +0200, Joerg Roedel wrote:
> > > > > On Wed, May 27, 2020 at 01:18:42PM -0500, Bjorn Helgaas wrote:
> > > > > > Is this slowdown significant? We already iterate over every device
> > > > > > when applying PCI_FIXUP_FINAL quirks, so if we used the existing
> > > > > > PCI_FIXUP_FINAL, we wouldn't be adding a new loop. We would only be
> > > > > > adding two more iterations to the loop in pci_do_fixups() that tries
> > > > > > to match quirks against the current device. I doubt that would be a
> > > > > > measurable slowdown.
> > > > > I don't know how significant it is, but I remember people complaining
> > > > > about adding new PCI quirks because it takes too long for them to run
> > > > > them all. That was in the discussion about the quirk disabling ATS on
> > > > > AMD Stoney systems.
> > > > >
> > > > > So it probably depends on how many PCI devices are in the system whether
> > > > > it causes any measureable slowdown.
> > > > I found this [1] from Paul Menzel, which was a slowdown caused by
> > > > quirk_usb_early_handoff(). I think the real problem is individual
> > > > quirks that take a long time.
> > > >
> > > > The PCI_FIXUP_IOMMU things we're talking about should be fast, and of
> > > > course, they're only run for matching devices anyway. So I'd rather
> > > > keep them as PCI_FIXUP_FINAL than add a whole new phase.
> > > >
> > > Thanks Bjorn for taking time for this.
> > > If so, it would be much simpler.
> > >
> > > +++ b/drivers/iommu/iommu.c
> > > @@ -2418,6 +2418,10 @@ int iommu_fwspec_init(struct device *dev, struct
> > > fwnode_handle *iommu_fwnode,
> > >         fwspec->iommu_fwnode = iommu_fwnode;
> > >         fwspec->ops = ops;
> > >         dev_iommu_fwspec_set(dev, fwspec);
> > > +
> > > +       if (dev_is_pci(dev))
> > > +               pci_fixup_device(pci_fixup_final, to_pci_dev(dev));
> > > +
> > >
> > > Then pci_fixup_final will be called twice, the first in pci_bus_add_device.
> > > Here in iommu_fwspec_init is the second time, specifically for iommu_fwspec.
> > > Will send this when 5.8-rc1 is open.
> >
> > Wait, this whole fixup approach seems wrong to me. No matter how you
> > do the fixup, it's still a fixup, which means it requires ongoing
> > maintenance. Surely we don't want to have to add the Vendor/Device ID
> > for every new AMBA device that comes along, do we?
> >
> Here the fake pci device has standard PCI cfg space, but physical
> implementation is base on AMBA
> They can provide pasid feature.
> However,
> 1, does not support tlp since they are not real pci devices.
> 2. does not support pri, instead support stall (provided by smmu)
> And stall is not a pci feature, so it is not described in struct pci_dev,
> but in struct iommu_fwspec.
> So we use this fixup to tell pci system that the devices can support stall,
> and hereby support pasid.

This did not answer my question. Are you proposing that we update a
quirk every time a new AMBA device is released? I don't think that
would be a good model.

2020-06-09 04:03:07

by zhangfei

[permalink] [raw]
Subject: Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU

Hi, Bjorn

On 2020/6/9 上午12:41, Bjorn Helgaas wrote:
> On Mon, Jun 08, 2020 at 10:54:15AM +0800, Zhangfei Gao wrote:
>> On 2020/6/6 上午7:19, Bjorn Helgaas wrote:
>>> On Thu, Jun 04, 2020 at 09:33:07PM +0800, Zhangfei Gao wrote:
>>>> On 2020/6/2 上午1:41, Bjorn Helgaas wrote:
>>>>> On Thu, May 28, 2020 at 09:33:44AM +0200, Joerg Roedel wrote:
>>>>>> On Wed, May 27, 2020 at 01:18:42PM -0500, Bjorn Helgaas wrote:
>>>>>>> Is this slowdown significant? We already iterate over every device
>>>>>>> when applying PCI_FIXUP_FINAL quirks, so if we used the existing
>>>>>>> PCI_FIXUP_FINAL, we wouldn't be adding a new loop. We would only be
>>>>>>> adding two more iterations to the loop in pci_do_fixups() that tries
>>>>>>> to match quirks against the current device. I doubt that would be a
>>>>>>> measurable slowdown.
>>>>>> I don't know how significant it is, but I remember people complaining
>>>>>> about adding new PCI quirks because it takes too long for them to run
>>>>>> them all. That was in the discussion about the quirk disabling ATS on
>>>>>> AMD Stoney systems.
>>>>>>
>>>>>> So it probably depends on how many PCI devices are in the system whether
>>>>>> it causes any measureable slowdown.
>>>>> I found this [1] from Paul Menzel, which was a slowdown caused by
>>>>> quirk_usb_early_handoff(). I think the real problem is individual
>>>>> quirks that take a long time.
>>>>>
>>>>> The PCI_FIXUP_IOMMU things we're talking about should be fast, and of
>>>>> course, they're only run for matching devices anyway. So I'd rather
>>>>> keep them as PCI_FIXUP_FINAL than add a whole new phase.
>>>>>
>>>> Thanks Bjorn for taking time for this.
>>>> If so, it would be much simpler.
>>>>
>>>> +++ b/drivers/iommu/iommu.c
>>>> @@ -2418,6 +2418,10 @@ int iommu_fwspec_init(struct device *dev, struct
>>>> fwnode_handle *iommu_fwnode,
>>>>         fwspec->iommu_fwnode = iommu_fwnode;
>>>>         fwspec->ops = ops;
>>>>         dev_iommu_fwspec_set(dev, fwspec);
>>>> +
>>>> +       if (dev_is_pci(dev))
>>>> +               pci_fixup_device(pci_fixup_final, to_pci_dev(dev));
>>>> +
>>>>
>>>> Then pci_fixup_final will be called twice, the first in pci_bus_add_device.
>>>> Here in iommu_fwspec_init is the second time, specifically for iommu_fwspec.
>>>> Will send this when 5.8-rc1 is open.
>>> Wait, this whole fixup approach seems wrong to me. No matter how you
>>> do the fixup, it's still a fixup, which means it requires ongoing
>>> maintenance. Surely we don't want to have to add the Vendor/Device ID
>>> for every new AMBA device that comes along, do we?
>>>
>> Here the fake pci device has standard PCI cfg space, but physical
>> implementation is base on AMBA
>> They can provide pasid feature.
>> However,
>> 1, does not support tlp since they are not real pci devices.
>> 2. does not support pri, instead support stall (provided by smmu)
>> And stall is not a pci feature, so it is not described in struct pci_dev,
>> but in struct iommu_fwspec.
>> So we use this fixup to tell pci system that the devices can support stall,
>> and hereby support pasid.
> This did not answer my question. Are you proposing that we update a
> quirk every time a new AMBA device is released? I don't think that
> would be a good model.
Yes, you are right, but we do not have any better idea yet.
Currently we have three fake pci devices, which support stall and pasid.
We have to let pci system know the device can support pasid, because of
stall feature, though not support pri.
Do you have any other ideas?

Thanks

2020-06-09 09:17:56

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU

On Tue, Jun 9, 2020 at 6:02 AM Zhangfei Gao <[email protected]> wrote:
> On 2020/6/9 上午12:41, Bjorn Helgaas wrote:
> > On Mon, Jun 08, 2020 at 10:54:15AM +0800, Zhangfei Gao wrote:
> >> On 2020/6/6 上午7:19, Bjorn Helgaas wrote:
> >>>> +++ b/drivers/iommu/iommu.c
> >>>> @@ -2418,6 +2418,10 @@ int iommu_fwspec_init(struct device *dev, struct
> >>>> fwnode_handle *iommu_fwnode,
> >>>> fwspec->iommu_fwnode = iommu_fwnode;
> >>>> fwspec->ops = ops;
> >>>> dev_iommu_fwspec_set(dev, fwspec);
> >>>> +
> >>>> + if (dev_is_pci(dev))
> >>>> + pci_fixup_device(pci_fixup_final, to_pci_dev(dev));
> >>>> +
> >>>>
> >>>> Then pci_fixup_final will be called twice, the first in pci_bus_add_device.
> >>>> Here in iommu_fwspec_init is the second time, specifically for iommu_fwspec.
> >>>> Will send this when 5.8-rc1 is open.
> >>> Wait, this whole fixup approach seems wrong to me. No matter how you
> >>> do the fixup, it's still a fixup, which means it requires ongoing
> >>> maintenance. Surely we don't want to have to add the Vendor/Device ID
> >>> for every new AMBA device that comes along, do we?
> >>>
> >> Here the fake pci device has standard PCI cfg space, but physical
> >> implementation is base on AMBA
> >> They can provide pasid feature.
> >> However,
> >> 1, does not support tlp since they are not real pci devices.
> >> 2. does not support pri, instead support stall (provided by smmu)
> >> And stall is not a pci feature, so it is not described in struct pci_dev,
> >> but in struct iommu_fwspec.
> >> So we use this fixup to tell pci system that the devices can support stall,
> >> and hereby support pasid.
> > This did not answer my question. Are you proposing that we update a
> > quirk every time a new AMBA device is released? I don't think that
> > would be a good model.
>
> Yes, you are right, but we do not have any better idea yet.
> Currently we have three fake pci devices, which support stall and pasid.
> We have to let pci system know the device can support pasid, because of
> stall feature, though not support pri.
> Do you have any other ideas?

It sounds like the best way would be to allocate a PCI capability for it, so
detection can be done through config space, at least in future devices,
or possibly after a firmware update if the config space in your system
is controlled by firmware somewhere. Once there is a proper mechanism
to do this, using fixups to detect the early devices that don't use that
should be uncontroversial. I have no idea what the process or timeline
is to add new capabilities into the PCIe specification, or if this one
would be acceptable to the PCI SIG at all.

If detection cannot be done through PCI config space, the next best
alternative is to pass auxiliary data through firmware. On DT based
machines, you can list non-hotpluggable PCIe devices and add custom
properties that could be read during device enumeration. I assume
ACPI has something similar, but I have not done that.

Arnd

2020-06-09 16:50:22

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU

On Tue, Jun 09, 2020 at 11:15:06AM +0200, Arnd Bergmann wrote:
> On Tue, Jun 9, 2020 at 6:02 AM Zhangfei Gao <[email protected]> wrote:
> > On 2020/6/9 上午12:41, Bjorn Helgaas wrote:
> > > On Mon, Jun 08, 2020 at 10:54:15AM +0800, Zhangfei Gao wrote:
> > >> On 2020/6/6 上午7:19, Bjorn Helgaas wrote:
> > >>>> +++ b/drivers/iommu/iommu.c
> > >>>> @@ -2418,6 +2418,10 @@ int iommu_fwspec_init(struct device *dev, struct
> > >>>> fwnode_handle *iommu_fwnode,
> > >>>> fwspec->iommu_fwnode = iommu_fwnode;
> > >>>> fwspec->ops = ops;
> > >>>> dev_iommu_fwspec_set(dev, fwspec);
> > >>>> +
> > >>>> + if (dev_is_pci(dev))
> > >>>> + pci_fixup_device(pci_fixup_final, to_pci_dev(dev));
> > >>>> +
> > >>>>
> > >>>> Then pci_fixup_final will be called twice, the first in pci_bus_add_device.
> > >>>> Here in iommu_fwspec_init is the second time, specifically for iommu_fwspec.
> > >>>> Will send this when 5.8-rc1 is open.
> > >>> Wait, this whole fixup approach seems wrong to me. No matter how you
> > >>> do the fixup, it's still a fixup, which means it requires ongoing
> > >>> maintenance. Surely we don't want to have to add the Vendor/Device ID
> > >>> for every new AMBA device that comes along, do we?
> > >>>
> > >> Here the fake pci device has standard PCI cfg space, but physical
> > >> implementation is base on AMBA
> > >> They can provide pasid feature.
> > >> However,
> > >> 1, does not support tlp since they are not real pci devices.
> > >> 2. does not support pri, instead support stall (provided by smmu)
> > >> And stall is not a pci feature, so it is not described in struct pci_dev,
> > >> but in struct iommu_fwspec.
> > >> So we use this fixup to tell pci system that the devices can support stall,
> > >> and hereby support pasid.
> > > This did not answer my question. Are you proposing that we update a
> > > quirk every time a new AMBA device is released? I don't think that
> > > would be a good model.
> >
> > Yes, you are right, but we do not have any better idea yet.
> > Currently we have three fake pci devices, which support stall and pasid.
> > We have to let pci system know the device can support pasid, because of
> > stall feature, though not support pri.
> > Do you have any other ideas?
>
> It sounds like the best way would be to allocate a PCI capability for it, so
> detection can be done through config space, at least in future devices,
> or possibly after a firmware update if the config space in your system
> is controlled by firmware somewhere. Once there is a proper mechanism
> to do this, using fixups to detect the early devices that don't use that
> should be uncontroversial. I have no idea what the process or timeline
> is to add new capabilities into the PCIe specification, or if this one
> would be acceptable to the PCI SIG at all.

That sounds like a possibility. The spec already defines a
Vendor-Specific Extended Capability (PCIe r5.0, sec 7.9.5) that might
be a candidate.

> If detection cannot be done through PCI config space, the next best
> alternative is to pass auxiliary data through firmware. On DT based
> machines, you can list non-hotpluggable PCIe devices and add custom
> properties that could be read during device enumeration. I assume
> ACPI has something similar, but I have not done that.

ACPI has _DSM (ACPI v6.3, sec 9.1.1), which might be a candidate. I
like this better than a PCI capability because the property you need
to expose is not a PCI property.

2020-06-11 02:57:32

by zhangfei

[permalink] [raw]
Subject: Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU



On 2020/6/10 上午12:49, Bjorn Helgaas wrote:
> On Tue, Jun 09, 2020 at 11:15:06AM +0200, Arnd Bergmann wrote:
>> On Tue, Jun 9, 2020 at 6:02 AM Zhangfei Gao <[email protected]> wrote:
>>> On 2020/6/9 上午12:41, Bjorn Helgaas wrote:
>>>> On Mon, Jun 08, 2020 at 10:54:15AM +0800, Zhangfei Gao wrote:
>>>>> On 2020/6/6 上午7:19, Bjorn Helgaas wrote:
>>>>>>> +++ b/drivers/iommu/iommu.c
>>>>>>> @@ -2418,6 +2418,10 @@ int iommu_fwspec_init(struct device *dev, struct
>>>>>>> fwnode_handle *iommu_fwnode,
>>>>>>> fwspec->iommu_fwnode = iommu_fwnode;
>>>>>>> fwspec->ops = ops;
>>>>>>> dev_iommu_fwspec_set(dev, fwspec);
>>>>>>> +
>>>>>>> + if (dev_is_pci(dev))
>>>>>>> + pci_fixup_device(pci_fixup_final, to_pci_dev(dev));
>>>>>>> +
>>>>>>>
>>>>>>> Then pci_fixup_final will be called twice, the first in pci_bus_add_device.
>>>>>>> Here in iommu_fwspec_init is the second time, specifically for iommu_fwspec.
>>>>>>> Will send this when 5.8-rc1 is open.
>>>>>> Wait, this whole fixup approach seems wrong to me. No matter how you
>>>>>> do the fixup, it's still a fixup, which means it requires ongoing
>>>>>> maintenance. Surely we don't want to have to add the Vendor/Device ID
>>>>>> for every new AMBA device that comes along, do we?
>>>>>>
>>>>> Here the fake pci device has standard PCI cfg space, but physical
>>>>> implementation is base on AMBA
>>>>> They can provide pasid feature.
>>>>> However,
>>>>> 1, does not support tlp since they are not real pci devices.
>>>>> 2. does not support pri, instead support stall (provided by smmu)
>>>>> And stall is not a pci feature, so it is not described in struct pci_dev,
>>>>> but in struct iommu_fwspec.
>>>>> So we use this fixup to tell pci system that the devices can support stall,
>>>>> and hereby support pasid.
>>>> This did not answer my question. Are you proposing that we update a
>>>> quirk every time a new AMBA device is released? I don't think that
>>>> would be a good model.
>>> Yes, you are right, but we do not have any better idea yet.
>>> Currently we have three fake pci devices, which support stall and pasid.
>>> We have to let pci system know the device can support pasid, because of
>>> stall feature, though not support pri.
>>> Do you have any other ideas?
>> It sounds like the best way would be to allocate a PCI capability for it, so
>> detection can be done through config space, at least in future devices,
>> or possibly after a firmware update if the config space in your system
>> is controlled by firmware somewhere. Once there is a proper mechanism
>> to do this, using fixups to detect the early devices that don't use that
>> should be uncontroversial. I have no idea what the process or timeline
>> is to add new capabilities into the PCIe specification, or if this one
>> would be acceptable to the PCI SIG at all.
> That sounds like a possibility. The spec already defines a
> Vendor-Specific Extended Capability (PCIe r5.0, sec 7.9.5) that might
> be a candidate.
Will investigate this, thanks Bjorn
>
>> If detection cannot be done through PCI config space, the next best
>> alternative is to pass auxiliary data through firmware. On DT based
>> machines, you can list non-hotpluggable PCIe devices and add custom
>> properties that could be read during device enumeration. I assume
>> ACPI has something similar, but I have not done that.
Yes, thanks Arnd
> ACPI has _DSM (ACPI v6.3, sec 9.1.1), which might be a candidate. I
> like this better than a PCI capability because the property you need
> to expose is not a PCI property.
_DSM may not workable, since it is working in runtime.
We need stall information in init stage, neither too early (after
allocation of iommu_fwspec)
nor too late (before arm_smmu_add_device ).

By the way,
It would be a long time if we need modify either pcie spec or acpi spec.
Can we use pci_fixup_device in iommu_fwspec_init first, it is relatively
simple
and meet the requirement of platform device using pasid, and they are
already in product.

Thanks

2020-06-11 13:48:30

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU

On Thu, Jun 11, 2020 at 10:54:45AM +0800, Zhangfei Gao wrote:
> On 2020/6/10 上午12:49, Bjorn Helgaas wrote:
> > On Tue, Jun 09, 2020 at 11:15:06AM +0200, Arnd Bergmann wrote:
> > > On Tue, Jun 9, 2020 at 6:02 AM Zhangfei Gao <[email protected]> wrote:
> > > > On 2020/6/9 上午12:41, Bjorn Helgaas wrote:
> > > > > On Mon, Jun 08, 2020 at 10:54:15AM +0800, Zhangfei Gao wrote:
> > > > > > On 2020/6/6 上午7:19, Bjorn Helgaas wrote:
> > > > > > > > +++ b/drivers/iommu/iommu.c
> > > > > > > > @@ -2418,6 +2418,10 @@ int iommu_fwspec_init(struct device *dev, struct
> > > > > > > > fwnode_handle *iommu_fwnode,
> > > > > > > > fwspec->iommu_fwnode = iommu_fwnode;
> > > > > > > > fwspec->ops = ops;
> > > > > > > > dev_iommu_fwspec_set(dev, fwspec);
> > > > > > > > +
> > > > > > > > + if (dev_is_pci(dev))
> > > > > > > > + pci_fixup_device(pci_fixup_final, to_pci_dev(dev));
> > > > > > > > +
> > > > > > > >
> > > > > > > > Then pci_fixup_final will be called twice, the first in pci_bus_add_device.
> > > > > > > > Here in iommu_fwspec_init is the second time, specifically for iommu_fwspec.
> > > > > > > > Will send this when 5.8-rc1 is open.
> > > > > > > Wait, this whole fixup approach seems wrong to me. No matter how you
> > > > > > > do the fixup, it's still a fixup, which means it requires ongoing
> > > > > > > maintenance. Surely we don't want to have to add the Vendor/Device ID
> > > > > > > for every new AMBA device that comes along, do we?
> > > > > > >
> > > > > > Here the fake pci device has standard PCI cfg space, but physical
> > > > > > implementation is base on AMBA
> > > > > > They can provide pasid feature.
> > > > > > However,
> > > > > > 1, does not support tlp since they are not real pci devices.
> > > > > > 2. does not support pri, instead support stall (provided by smmu)
> > > > > > And stall is not a pci feature, so it is not described in struct pci_dev,
> > > > > > but in struct iommu_fwspec.
> > > > > > So we use this fixup to tell pci system that the devices can support stall,
> > > > > > and hereby support pasid.
> > > > > This did not answer my question. Are you proposing that we update a
> > > > > quirk every time a new AMBA device is released? I don't think that
> > > > > would be a good model.
> > > > Yes, you are right, but we do not have any better idea yet.
> > > > Currently we have three fake pci devices, which support stall and pasid.
> > > > We have to let pci system know the device can support pasid, because of
> > > > stall feature, though not support pri.
> > > > Do you have any other ideas?
> > > It sounds like the best way would be to allocate a PCI capability for it, so
> > > detection can be done through config space, at least in future devices,
> > > or possibly after a firmware update if the config space in your system
> > > is controlled by firmware somewhere. Once there is a proper mechanism
> > > to do this, using fixups to detect the early devices that don't use that
> > > should be uncontroversial. I have no idea what the process or timeline
> > > is to add new capabilities into the PCIe specification, or if this one
> > > would be acceptable to the PCI SIG at all.
> > That sounds like a possibility. The spec already defines a
> > Vendor-Specific Extended Capability (PCIe r5.0, sec 7.9.5) that might
> > be a candidate.
> Will investigate this, thanks Bjorn

FWIW, there's also a Vendor-Specific Capability that can appear in the
first 256 bytes of config space (the Vendor-Specific Extended
Capability must appear in the "Extended Configuration Space" from
0x100-0xfff).

> > > If detection cannot be done through PCI config space, the next best
> > > alternative is to pass auxiliary data through firmware. On DT based
> > > machines, you can list non-hotpluggable PCIe devices and add custom
> > > properties that could be read during device enumeration. I assume
> > > ACPI has something similar, but I have not done that.
> Yes, thanks Arnd
> > ACPI has _DSM (ACPI v6.3, sec 9.1.1), which might be a candidate. I
> > like this better than a PCI capability because the property you need
> > to expose is not a PCI property.
> _DSM may not workable, since it is working in runtime.
> We need stall information in init stage, neither too early (after allocation
> of iommu_fwspec)
> nor too late (before arm_smmu_add_device ).

I'm not aware of a restriction on when _DSM can be evaluated. I'm
looking at ACPI v6.3, sec 9.1.1. Are you seeing something different?

> By the way, It would be a long time if we need modify either pcie
> spec or acpi spec. Can we use pci_fixup_device in iommu_fwspec_init
> first, it is relatively simple and meet the requirement of platform
> device using pasid, and they are already in product.

Neither the PCI Vendor-Specific Capability nor the ACPI _DSM requires
a spec change. Both can be completely vendor-defined.

2020-06-13 14:32:08

by zhangfei

[permalink] [raw]
Subject: Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU



On 2020/6/11 下午9:44, Bjorn Helgaas wrote:
> +++ b/drivers/iommu/iommu.c
>>>>>>>>> @@ -2418,6 +2418,10 @@ int iommu_fwspec_init(struct device *dev, struct
>>>>>>>>> fwnode_handle *iommu_fwnode,
>>>>>>>>> fwspec->iommu_fwnode = iommu_fwnode;
>>>>>>>>> fwspec->ops = ops;
>>>>>>>>> dev_iommu_fwspec_set(dev, fwspec);
>>>>>>>>> +
>>>>>>>>> + if (dev_is_pci(dev))
>>>>>>>>> + pci_fixup_device(pci_fixup_final, to_pci_dev(dev));
>>>>>>>>> +
>>>>>>>>>
>>>>>>>>> Then pci_fixup_final will be called twice, the first in pci_bus_add_device.
>>>>>>>>> Here in iommu_fwspec_init is the second time, specifically for iommu_fwspec.
>>>>>>>>> Will send this when 5.8-rc1 is open.
>>>>>>>> Wait, this whole fixup approach seems wrong to me. No matter how you
>>>>>>>> do the fixup, it's still a fixup, which means it requires ongoing
>>>>>>>> maintenance. Surely we don't want to have to add the Vendor/Device ID
>>>>>>>> for every new AMBA device that comes along, do we?
>>>>>>>>
>>>>>>> Here the fake pci device has standard PCI cfg space, but physical
>>>>>>> implementation is base on AMBA
>>>>>>> They can provide pasid feature.
>>>>>>> However,
>>>>>>> 1, does not support tlp since they are not real pci devices.
>>>>>>> 2. does not support pri, instead support stall (provided by smmu)
>>>>>>> And stall is not a pci feature, so it is not described in struct pci_dev,
>>>>>>> but in struct iommu_fwspec.
>>>>>>> So we use this fixup to tell pci system that the devices can support stall,
>>>>>>> and hereby support pasid.
>>>>>> This did not answer my question. Are you proposing that we update a
>>>>>> quirk every time a new AMBA device is released? I don't think that
>>>>>> would be a good model.
>>>>> Yes, you are right, but we do not have any better idea yet.
>>>>> Currently we have three fake pci devices, which support stall and pasid.
>>>>> We have to let pci system know the device can support pasid, because of
>>>>> stall feature, though not support pri.
>>>>> Do you have any other ideas?
>>>> It sounds like the best way would be to allocate a PCI capability for it, so
>>>> detection can be done through config space, at least in future devices,
>>>> or possibly after a firmware update if the config space in your system
>>>> is controlled by firmware somewhere. Once there is a proper mechanism
>>>> to do this, using fixups to detect the early devices that don't use that
>>>> should be uncontroversial. I have no idea what the process or timeline
>>>> is to add new capabilities into the PCIe specification, or if this one
>>>> would be acceptable to the PCI SIG at all.
>>> That sounds like a possibility. The spec already defines a
>>> Vendor-Specific Extended Capability (PCIe r5.0, sec 7.9.5) that might
>>> be a candidate.
>> Will investigate this, thanks Bjorn
> FWIW, there's also a Vendor-Specific Capability that can appear in the
> first 256 bytes of config space (the Vendor-Specific Extended
> Capability must appear in the "Extended Configuration Space" from
> 0x100-0xfff).
Unfortunately our silicon does not have either Vendor-Specific Capability or
Vendor-Specific Extended Capability.

Studied commit 8531e283bee66050734fb0e89d53e85fd5ce24a4
Looks this method requires adding member (like can_stall) to struct
pci_dev, looks difficult.

>
>>>> If detection cannot be done through PCI config space, the next best
>>>> alternative is to pass auxiliary data through firmware. On DT based
>>>> machines, you can list non-hotpluggable PCIe devices and add custom
>>>> properties that could be read during device enumeration. I assume
>>>> ACPI has something similar, but I have not done that.
>> Yes, thanks Arnd
>>> ACPI has _DSM (ACPI v6.3, sec 9.1.1), which might be a candidate. I
>>> like this better than a PCI capability because the property you need
>>> to expose is not a PCI property.
>> _DSM may not workable, since it is working in runtime.
>> We need stall information in init stage, neither too early (after allocation
>> of iommu_fwspec)
>> nor too late (before arm_smmu_add_device ).
> I'm not aware of a restriction on when _DSM can be evaluated. I'm
> looking at ACPI v6.3, sec 9.1.1. Are you seeing something different?
DSM method seems requires vendor specific guid, and code would be vendor
specific.
Except adding uuid to some spec like pci_acpi_dsm_guid.
obj = acpi_evaluate_dsm(ACPI_HANDLE(bus->bridge), &pci_acpi_dsm_guid, 1,
IGNORE_PCI_BOOT_CONFIG_DSM, NULL);

>> By the way, It would be a long time if we need modify either pcie
>> spec or acpi spec. Can we use pci_fixup_device in iommu_fwspec_init
>> first, it is relatively simple and meet the requirement of platform
>> device using pasid, and they are already in product.
> Neither the PCI Vendor-Specific Capability nor the ACPI _DSM requires
> a spec change. Both can be completely vendor-defined.
Adding vendor-specific code to common files looks a bit ugly.

Thanks

2020-06-15 23:52:52

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU

On Sat, Jun 13, 2020 at 10:30:56PM +0800, Zhangfei Gao wrote:
> On 2020/6/11 下午9:44, Bjorn Helgaas wrote:
> > +++ b/drivers/iommu/iommu.c
> > > > > > > > > > @@ -2418,6 +2418,10 @@ int iommu_fwspec_init(struct device *dev, struct
> > > > > > > > > > fwnode_handle *iommu_fwnode,
> > > > > > > > > > fwspec->iommu_fwnode = iommu_fwnode;
> > > > > > > > > > fwspec->ops = ops;
> > > > > > > > > > dev_iommu_fwspec_set(dev, fwspec);
> > > > > > > > > > +
> > > > > > > > > > + if (dev_is_pci(dev))
> > > > > > > > > > + pci_fixup_device(pci_fixup_final, to_pci_dev(dev));
> > > > > > > > > > +
> > > > > > > > > >
> > > > > > > > > > Then pci_fixup_final will be called twice, the first in pci_bus_add_device.
> > > > > > > > > > Here in iommu_fwspec_init is the second time, specifically for iommu_fwspec.
> > > > > > > > > > Will send this when 5.8-rc1 is open.
> > > > > > > > > Wait, this whole fixup approach seems wrong to me. No matter how you
> > > > > > > > > do the fixup, it's still a fixup, which means it requires ongoing
> > > > > > > > > maintenance. Surely we don't want to have to add the Vendor/Device ID
> > > > > > > > > for every new AMBA device that comes along, do we?
> > > > > > > > >
> > > > > > > > Here the fake pci device has standard PCI cfg space, but physical
> > > > > > > > implementation is base on AMBA
> > > > > > > > They can provide pasid feature.
> > > > > > > > However,
> > > > > > > > 1, does not support tlp since they are not real pci devices.
> > > > > > > > 2. does not support pri, instead support stall (provided by smmu)
> > > > > > > > And stall is not a pci feature, so it is not described in struct pci_dev,
> > > > > > > > but in struct iommu_fwspec.
> > > > > > > > So we use this fixup to tell pci system that the devices can support stall,
> > > > > > > > and hereby support pasid.
> > > > > > > This did not answer my question. Are you proposing that we update a
> > > > > > > quirk every time a new AMBA device is released? I don't think that
> > > > > > > would be a good model.
> > > > > > Yes, you are right, but we do not have any better idea yet.
> > > > > > Currently we have three fake pci devices, which support stall and pasid.
> > > > > > We have to let pci system know the device can support pasid, because of
> > > > > > stall feature, though not support pri.
> > > > > > Do you have any other ideas?
> > > > > It sounds like the best way would be to allocate a PCI capability for it, so
> > > > > detection can be done through config space, at least in future devices,
> > > > > or possibly after a firmware update if the config space in your system
> > > > > is controlled by firmware somewhere. Once there is a proper mechanism
> > > > > to do this, using fixups to detect the early devices that don't use that
> > > > > should be uncontroversial. I have no idea what the process or timeline
> > > > > is to add new capabilities into the PCIe specification, or if this one
> > > > > would be acceptable to the PCI SIG at all.
> > > > That sounds like a possibility. The spec already defines a
> > > > Vendor-Specific Extended Capability (PCIe r5.0, sec 7.9.5) that might
> > > > be a candidate.
> > > Will investigate this, thanks Bjorn
> > FWIW, there's also a Vendor-Specific Capability that can appear in the
> > first 256 bytes of config space (the Vendor-Specific Extended
> > Capability must appear in the "Extended Configuration Space" from
> > 0x100-0xfff).
> Unfortunately our silicon does not have either Vendor-Specific Capability or
> Vendor-Specific Extended Capability.
>
> Studied commit 8531e283bee66050734fb0e89d53e85fd5ce24a4
> Looks this method requires adding member (like can_stall) to struct pci_dev,
> looks difficult.

The problem is that we don't want to add device IDs every time a new
chip comes out. Adding one or two device IDs for silicon that's
already released is not a problem as long as you have a strategy for
*future* devices so they don't require a quirk.

> > > > > If detection cannot be done through PCI config space, the next best
> > > > > alternative is to pass auxiliary data through firmware. On DT based
> > > > > machines, you can list non-hotpluggable PCIe devices and add custom
> > > > > properties that could be read during device enumeration. I assume
> > > > > ACPI has something similar, but I have not done that.
> > > Yes, thanks Arnd
> > > > ACPI has _DSM (ACPI v6.3, sec 9.1.1), which might be a candidate. I
> > > > like this better than a PCI capability because the property you need
> > > > to expose is not a PCI property.
> > > _DSM may not workable, since it is working in runtime.
> > > We need stall information in init stage, neither too early (after allocation
> > > of iommu_fwspec)
> > > nor too late (before arm_smmu_add_device ).
> > I'm not aware of a restriction on when _DSM can be evaluated. I'm
> > looking at ACPI v6.3, sec 9.1.1. Are you seeing something different?
> DSM method seems requires vendor specific guid, and code would be vendor
> specific.

_DSM indeed requires a vendor-specific UUID, precisely *because*
vendors are free to define their own functionality without requiring
changes to the ACPI spec. From the spec (ACPI v6.3, sec 9.1.1):

New UUIDs may also be created by OEMs and IHVs for custom devices
and other interface or device governing bodies (e.g. the PCI SIG),
as long as the UUID is different from other published UUIDs.

2020-06-19 02:39:17

by zhangfei

[permalink] [raw]
Subject: Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU

Hi, Bjorn

On 2020/6/16 上午7:52, Bjorn Helgaas wrote:
> On Sat, Jun 13, 2020 at 10:30:56PM +0800, Zhangfei Gao wrote:
>> On 2020/6/11 下午9:44, Bjorn Helgaas wrote:
>>> +++ b/drivers/iommu/iommu.c
>>>>>>>>>>> @@ -2418,6 +2418,10 @@ int iommu_fwspec_init(struct device *dev, struct
>>>>>>>>>>> fwnode_handle *iommu_fwnode,
>>>>>>>>>>> fwspec->iommu_fwnode = iommu_fwnode;
>>>>>>>>>>> fwspec->ops = ops;
>>>>>>>>>>> dev_iommu_fwspec_set(dev, fwspec);
>>>>>>>>>>> +
>>>>>>>>>>> + if (dev_is_pci(dev))
>>>>>>>>>>> + pci_fixup_device(pci_fixup_final, to_pci_dev(dev));
>>>>>>>>>>> +
>>>>>>>>>>>
>>>>>>>>>>> Then pci_fixup_final will be called twice, the first in pci_bus_add_device.
>>>>>>>>>>> Here in iommu_fwspec_init is the second time, specifically for iommu_fwspec.
>>>>>>>>>>> Will send this when 5.8-rc1 is open.
>>>>>>>>>> Wait, this whole fixup approach seems wrong to me. No matter how you
>>>>>>>>>> do the fixup, it's still a fixup, which means it requires ongoing
>>>>>>>>>> maintenance. Surely we don't want to have to add the Vendor/Device ID
>>>>>>>>>> for every new AMBA device that comes along, do we?
>>>>>>>>>>
>>>>>>>>> Here the fake pci device has standard PCI cfg space, but physical
>>>>>>>>> implementation is base on AMBA
>>>>>>>>> They can provide pasid feature.
>>>>>>>>> However,
>>>>>>>>> 1, does not support tlp since they are not real pci devices.
>>>>>>>>> 2. does not support pri, instead support stall (provided by smmu)
>>>>>>>>> And stall is not a pci feature, so it is not described in struct pci_dev,
>>>>>>>>> but in struct iommu_fwspec.
>>>>>>>>> So we use this fixup to tell pci system that the devices can support stall,
>>>>>>>>> and hereby support pasid.
>>>>>>>> This did not answer my question. Are you proposing that we update a
>>>>>>>> quirk every time a new AMBA device is released? I don't think that
>>>>>>>> would be a good model.
>>>>>>> Yes, you are right, but we do not have any better idea yet.
>>>>>>> Currently we have three fake pci devices, which support stall and pasid.
>>>>>>> We have to let pci system know the device can support pasid, because of
>>>>>>> stall feature, though not support pri.
>>>>>>> Do you have any other ideas?
>>>>>> It sounds like the best way would be to allocate a PCI capability for it, so
>>>>>> detection can be done through config space, at least in future devices,
>>>>>> or possibly after a firmware update if the config space in your system
>>>>>> is controlled by firmware somewhere. Once there is a proper mechanism
>>>>>> to do this, using fixups to detect the early devices that don't use that
>>>>>> should be uncontroversial. I have no idea what the process or timeline
>>>>>> is to add new capabilities into the PCIe specification, or if this one
>>>>>> would be acceptable to the PCI SIG at all.
>>>>> That sounds like a possibility. The spec already defines a
>>>>> Vendor-Specific Extended Capability (PCIe r5.0, sec 7.9.5) that might
>>>>> be a candidate.
>>>> Will investigate this, thanks Bjorn
>>> FWIW, there's also a Vendor-Specific Capability that can appear in the
>>> first 256 bytes of config space (the Vendor-Specific Extended
>>> Capability must appear in the "Extended Configuration Space" from
>>> 0x100-0xfff).
>> Unfortunately our silicon does not have either Vendor-Specific Capability or
>> Vendor-Specific Extended Capability.
>>
>> Studied commit 8531e283bee66050734fb0e89d53e85fd5ce24a4
>> Looks this method requires adding member (like can_stall) to struct pci_dev,
>> looks difficult.
> The problem is that we don't want to add device IDs every time a new
> chip comes out. Adding one or two device IDs for silicon that's
> already released is not a problem as long as you have a strategy for
> *future* devices so they don't require a quirk.
>
>>>>>> If detection cannot be done through PCI config space, the next best
>>>>>> alternative is to pass auxiliary data through firmware. On DT based
>>>>>> machines, you can list non-hotpluggable PCIe devices and add custom
>>>>>> properties that could be read during device enumeration. I assume
>>>>>> ACPI has something similar, but I have not done that.
>>>> Yes, thanks Arnd
>>>>> ACPI has _DSM (ACPI v6.3, sec 9.1.1), which might be a candidate. I
>>>>> like this better than a PCI capability because the property you need
>>>>> to expose is not a PCI property.
>>>> _DSM may not workable, since it is working in runtime.
>>>> We need stall information in init stage, neither too early (after allocation
>>>> of iommu_fwspec)
>>>> nor too late (before arm_smmu_add_device ).
>>> I'm not aware of a restriction on when _DSM can be evaluated. I'm
>>> looking at ACPI v6.3, sec 9.1.1. Are you seeing something different?
>> DSM method seems requires vendor specific guid, and code would be vendor
>> specific.
> _DSM indeed requires a vendor-specific UUID, precisely *because*
> vendors are free to define their own functionality without requiring
> changes to the ACPI spec. From the spec (ACPI v6.3, sec 9.1.1):
>
> New UUIDs may also be created by OEMs and IHVs for custom devices
> and other interface or device governing bodies (e.g. the PCI SIG),
> as long as the UUID is different from other published UUIDs.
Have studied _DSM method, two issues we met comparing using quirk.

1. Need change definition of either pci_host_bridge or pci_dev, like
adding member can_stall,
while pci system does not know stall now.

a, pci devices do not have uuid: uuid need be described in dsdt, while
pci devices are not defined in dsdt.
    so we have to use host bridge.

b,  Parsing dsdt is in in pci subsystem.
Like drivers/acpi/pci_root.c:
       obj = acpi_evaluate_dsm(ACPI_HANDLE(bus->bridge),
&pci_acpi_dsm_guid, 1,
                                IGNORE_PCI_BOOT_CONFIG_DSM, NULL);

After parsing DSM in pci, we need record this info.
Currently, can_stall info is recorded in iommu_fwspec,
which is allocated in iommu_fwspec_init and called by
iort_iommu_configure for uefi.

2. Guest kernel also need support sva.
Using quirk, the guest can boot with sva enabled, since quirk is
self-contained by kernel.
If using  _DSM, a specific uefi or dtb has to be provided,
currently we can useQEMU_EFI.fd from apt install qemu-efi


Thanks

2020-06-22 11:54:45

by Joerg Roedel

[permalink] [raw]
Subject: Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU

On Mon, Jun 01, 2020 at 12:41:04PM -0500, Bjorn Helgaas wrote:
> I found this [1] from Paul Menzel, which was a slowdown caused by
> quirk_usb_early_handoff(). I think the real problem is individual
> quirks that take a long time.
>
> The PCI_FIXUP_IOMMU things we're talking about should be fast, and of
> course, they're only run for matching devices anyway. So I'd rather
> keep them as PCI_FIXUP_FINAL than add a whole new phase.

Okay, so if it is not a performance problem, then I am fine with using
PCI_FIXUP_FINAL. But I dislike calling the fixups from IOMMU code, there
must be a better solution.


Regards,

Joerg

> [1] https://lore.kernel.org/linux-pci/[email protected]/

2020-06-22 11:56:16

by Joerg Roedel

[permalink] [raw]
Subject: Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU

On Thu, Jun 04, 2020 at 09:33:07PM +0800, Zhangfei Gao wrote:
> +++ b/drivers/iommu/iommu.c
> @@ -2418,6 +2418,10 @@ int iommu_fwspec_init(struct device *dev, struct
> fwnode_handle *iommu_fwnode,
> ??????? fwspec->iommu_fwnode = iommu_fwnode;
> ??????? fwspec->ops = ops;
> ??????? dev_iommu_fwspec_set(dev, fwspec);
> +
> +?????? if (dev_is_pci(dev))
> +?????????????? pci_fixup_device(pci_fixup_final, to_pci_dev(dev));
> +

That's not going to fly, I don't think we should run the fixups twice,
and they should not be run from IOMMU code. Is the only reason for this
second pass that iommu_fwspec is not yet allocated when it runs the
first time? I ask because it might be easier to just allocate the struct
earlier then.

Regards,

Joerg

2020-06-23 07:50:01

by zhangfei

[permalink] [raw]
Subject: Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU

Hi, Joerg

On 2020/6/22 下午7:55, Joerg Roedel wrote:
> On Thu, Jun 04, 2020 at 09:33:07PM +0800, Zhangfei Gao wrote:
>> +++ b/drivers/iommu/iommu.c
>> @@ -2418,6 +2418,10 @@ int iommu_fwspec_init(struct device *dev, struct
>> fwnode_handle *iommu_fwnode,
>>         fwspec->iommu_fwnode = iommu_fwnode;
>>         fwspec->ops = ops;
>>         dev_iommu_fwspec_set(dev, fwspec);
>> +
>> +       if (dev_is_pci(dev))
>> +               pci_fixup_device(pci_fixup_final, to_pci_dev(dev));
>> +
> That's not going to fly, I don't think we should run the fixups twice,
> and they should not be run from IOMMU code. Is the only reason for this
> second pass that iommu_fwspec is not yet allocated when it runs the
> first time? I ask because it might be easier to just allocate the struct
> earlier then.
Thanks for looking this.

Yes, it is the only reason calling fixup secondly after iommu_fwspec is
allocated.

The first time fixup final is very early in pci_bus_add_device.
If allocating iommu_fwspec earlier, it maybe in pci_alloc_dev.
And assigning ops still in iommu_fwspec_init.
Have tested it works.
Not sure is it acceptable?

Alternatively, adding can_stall to struct pci_dev is simple but ugly too,
since pci does not know stall now.


Thanks



2020-06-23 15:07:22

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU

On Fri, Jun 19, 2020 at 10:26:54AM +0800, Zhangfei Gao wrote:
> Have studied _DSM method, two issues we met comparing using quirk.
>
> 1. Need change definition of either pci_host_bridge or pci_dev, like adding
> member can_stall,
> while pci system does not know stall now.
>
> a, pci devices do not have uuid: uuid need be described in dsdt, while pci
> devices are not defined in dsdt.
> ??? so we have to use host bridge.

PCI devices *can* be described in the DSDT. IIUC these particular
devices are hardwired (not plug-in cards), so platform firmware can
know about them and could describe them in the DSDT.

> b,? Parsing dsdt is in in pci subsystem.
> Like drivers/acpi/pci_root.c:
> ?????? obj = acpi_evaluate_dsm(ACPI_HANDLE(bus->bridge), &pci_acpi_dsm_guid,
> 1,
> ??????????????????????????????? IGNORE_PCI_BOOT_CONFIG_DSM, NULL);
>
> After parsing DSM in pci, we need record this info.
> Currently, can_stall info is recorded in iommu_fwspec,
> which is allocated in iommu_fwspec_init and called by iort_iommu_configure
> for uefi.

You can look for a _DSM wherever it is convenient for you. It could
be in an AMBA shim layer.

> 2. Guest kernel also need support sva.
> Using quirk, the guest can boot with sva enabled, since quirk is
> self-contained by kernel.
> If using? _DSM, a specific uefi or dtb has to be provided,
> currently we can useQEMU_EFI.fd from apt install qemu-efi

I don't quite understand what this means, but as I mentioned before, a
quirk for a *limited* number of devices is OK, as long as there is a
plan that removes the need for a quirk for future devices.

E.g., if the next platform version ships with a DTB or firmware with a
_DSM or other mechanism that enables the kernel to discover this
information without a kernel change, it's fine to use a quirk to cover
the early platform.

The principles are:

- I don't want to have to update a quirk for every new Device ID
that needs this.

- I don't really want to have to manage non-PCI information in the
struct pci_dev. If this is AMBA- or IOMMU-related, it should be
stored in a structure related to AMBA or the IOMMU.

2020-12-17 20:39:45

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU

On Wed, Dec 16, 2020 at 07:24:30PM +0800, Zhou Wang wrote:
> On 2020/6/23 23:04, Bjorn Helgaas wrote:
> > On Fri, Jun 19, 2020 at 10:26:54AM +0800, Zhangfei Gao wrote:
> >> Have studied _DSM method, two issues we met comparing using quirk.
> >>
> >> 1. Need change definition of either pci_host_bridge or pci_dev, like adding
> >> member can_stall,
> >> while pci system does not know stall now.
> >>
> >> a, pci devices do not have uuid: uuid need be described in dsdt, while pci
> >> devices are not defined in dsdt.
> >> so we have to use host bridge.
> >
> > PCI devices *can* be described in the DSDT. IIUC these particular
> > devices are hardwired (not plug-in cards), so platform firmware can
> > know about them and could describe them in the DSDT.
> >
> >> b, Parsing dsdt is in in pci subsystem.
> >> Like drivers/acpi/pci_root.c:
> >> obj = acpi_evaluate_dsm(ACPI_HANDLE(bus->bridge), &pci_acpi_dsm_guid,
> >> 1,
> >> IGNORE_PCI_BOOT_CONFIG_DSM, NULL);
> >>
> >> After parsing DSM in pci, we need record this info.
> >> Currently, can_stall info is recorded in iommu_fwspec,
> >> which is allocated in iommu_fwspec_init and called by iort_iommu_configure
> >> for uefi.
> >
> > You can look for a _DSM wherever it is convenient for you. It could
> > be in an AMBA shim layer.
> >
> >> 2. Guest kernel also need support sva.
> >> Using quirk, the guest can boot with sva enabled, since quirk is
> >> self-contained by kernel.
> >> If using _DSM, a specific uefi or dtb has to be provided,
> >> currently we can useQEMU_EFI.fd from apt install qemu-efi
> >
> > I don't quite understand what this means, but as I mentioned before, a
> > quirk for a *limited* number of devices is OK, as long as there is a
> > plan that removes the need for a quirk for future devices.
> >
> > E.g., if the next platform version ships with a DTB or firmware with a
> > _DSM or other mechanism that enables the kernel to discover this
> > information without a kernel change, it's fine to use a quirk to cover
> > the early platform.
> >
> > The principles are:
> >
> > - I don't want to have to update a quirk for every new Device ID
> > that needs this.
>
> Hi Bjorn and Zhangfei,
>
> We plan to use ATS/PRI to support SVA in future PCI devices. However, for
> current devices, we need to add limited number of quirk to let them
> work. The device IDs of current quirk needed devices are ZIP engine(0xa250, 0xa251),
> SEC engine(0xa255, 0xa256), HPRE engine(0xa258, 0xa259), revision id are
> 0x21 and 0x30.
>
> Let's continue to upstream these quirks!

Please post the patches you propose. I don't think the previous ones
are in my queue. Please include the lore URL for the previous
posting(s) in the cover letter so we can connect the discussion.

> > - I don't really want to have to manage non-PCI information in the
> > struct pci_dev. If this is AMBA- or IOMMU-related, it should be
> > stored in a structure related to AMBA or the IOMMU.
> > .
> >