2019-01-25 10:14:29

by Srinath Mannam

Subject: [PATCH v3 0/3] PCIe Host request to reserve IOVA

A few SoCs have a limitation that their PCIe host can't accept some
inbound address ranges. The allowed inbound address ranges are listed
in the dma-ranges DT property, and only these address ranges can be
used for IOVA mapping; the remaining address ranges have to be
reserved in the IOVA allocator.

The PCIe host driver of those SoCs has to add all address ranges whose
IOVAs must be reserved to the PCIe host bridge's resource entry list.
The IOMMU framework will then reserve these IOVAs while initializing
the IOMMU domain.

This patch set is based on Linux-5.0-rc2.

Changes from v2:
- Patch set rebased to Linux-5.0-rc2

Changes from v1:
- Addressed Oza's review comments.

Srinath Mannam (3):
PCI: Add dma-resv window list
iommu/dma: Reserve IOVA for PCI host reserve address list
PCI: iproc: Add dma reserve resources to host

drivers/iommu/dma-iommu.c | 8 ++++++
drivers/pci/controller/pcie-iproc.c | 51 ++++++++++++++++++++++++++++++++++++-
drivers/pci/probe.c | 3 +++
include/linux/pci.h | 1 +
4 files changed, 62 insertions(+), 1 deletion(-)

--
2.7.4



2019-01-25 10:14:38

by Srinath Mannam

Subject: [PATCH v3 1/3] PCI: Add dma-resv window list

Add a dma_resv list to the PCI host bridge structure to hold resource
entries for memory regions whose IOVAs have to be reserved.

The PCIe host driver adds resource entries to this list based on its
requirements. Some inbound address ranges cannot be accepted by some
PCIe hosts, so those ranges are added to this list to keep them out of
the IOMMU mapping.

While initializing the IOMMU domain of the PCI EPs connected to that
host bridge, the IOVAs for the address ranges in this list will be
reserved.

Signed-off-by: Srinath Mannam <[email protected]>
Based-on-patch-by: Oza Pawandeep <[email protected]>
Reviewed-by: Oza Pawandeep <[email protected]>
---
drivers/pci/probe.c | 3 +++
include/linux/pci.h | 1 +
2 files changed, 4 insertions(+)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 257b9f6..fd4b143 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -544,6 +544,7 @@ struct pci_host_bridge *pci_alloc_host_bridge(size_t priv)
return NULL;

INIT_LIST_HEAD(&bridge->windows);
+ INIT_LIST_HEAD(&bridge->dma_resv);
bridge->dev.release = pci_release_host_bridge_dev;

/*
@@ -572,6 +573,7 @@ struct pci_host_bridge *devm_pci_alloc_host_bridge(struct device *dev,
return NULL;

INIT_LIST_HEAD(&bridge->windows);
+ INIT_LIST_HEAD(&bridge->dma_resv);
bridge->dev.release = devm_pci_release_host_bridge_dev;

return bridge;
@@ -581,6 +583,7 @@ EXPORT_SYMBOL(devm_pci_alloc_host_bridge);
void pci_free_host_bridge(struct pci_host_bridge *bridge)
{
pci_free_resource_list(&bridge->windows);
+ pci_free_resource_list(&bridge->dma_resv);

kfree(bridge);
}
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 65f1d8c..aa06105 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -487,6 +487,7 @@ struct pci_host_bridge {
void *sysdata;
int busnr;
struct list_head windows; /* resource_entry */
+ struct list_head dma_resv; /* reserved DMA ranges */
u8 (*swizzle_irq)(struct pci_dev *, u8 *); /* Platform IRQ swizzler */
int (*map_irq)(const struct pci_dev *, u8, u8);
void (*release_fn)(struct pci_host_bridge *);
--
2.7.4


2019-01-25 10:15:00

by Srinath Mannam

Subject: [PATCH v3 2/3] iommu/dma: Reserve IOVA for PCI host reserve address list

The PCI host bridge has a list of resource entries containing the
address ranges for which IOVA mappings have to be reserved. These
address ranges are the holes in the dma-ranges DT property.

This is similar to the way PCI I/O resource address ranges are
reserved in the IOMMU for each EP connected to the host bridge.

Signed-off-by: Srinath Mannam <[email protected]>
Based-on-patch-by: Oza Pawandeep <[email protected]>
Reviewed-by: Oza Pawandeep <[email protected]>
---
drivers/iommu/dma-iommu.c | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index d19f3d6..81b591b 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -221,6 +221,14 @@ static void iova_reserve_pci_windows(struct pci_dev *dev,
hi = iova_pfn(iovad, window->res->end - window->offset);
reserve_iova(iovad, lo, hi);
}
+
+ /* Get reserved DMA windows from host bridge */
+ resource_list_for_each_entry(window, &bridge->dma_resv) {
+
+ lo = iova_pfn(iovad, window->res->start - window->offset);
+ hi = iova_pfn(iovad, window->res->end - window->offset);
+ reserve_iova(iovad, lo, hi);
+ }
}

static int iova_reserve_iommu_regions(struct device *dev,
--
2.7.4


2019-01-25 10:16:24

by Srinath Mannam

Subject: [PATCH v3 3/3] PCI: iproc: Add dma reserve resources to host

The iProc host has the limitation that it can use only the address
ranges given by the dma-ranges property as inbound addresses, so the
memory address holes in dma-ranges must be reserved and never handed
out as DMA addresses.

An inbound address from a PCIe device is not translated before it
reaches the IOMMU (or goes directly to a PE), but this host ignores
accesses to certain address ranges. IOVA ranges covering those
addresses therefore have to be reserved.

All such address ranges are created as resource entries by parsing the
dma-ranges DT property and are added to the dma_resv list of the PCI
host bridge.

Example:
dma-ranges = < \
0x43000000 0x00 0x80000000 0x00 0x80000000 0x00 0x80000000 \
0x43000000 0x08 0x00000000 0x08 0x00000000 0x08 0x00000000 \
0x43000000 0x80 0x00000000 0x80 0x00000000 0x40 0x00000000>

With this dma-ranges, the memory address ranges
0x0 - 0x80000000,
0x100000000 - 0x800000000,
0x1000000000 - 0x8000000000 and
0xc000000000 - 0xffffffffffffffff
are not allowed as inbound addresses, so these ranges are added to the
dma_resv list to reserve the corresponding IOVA address ranges.

Signed-off-by: Srinath Mannam <[email protected]>
Based-on-patch-by: Oza Pawandeep <[email protected]>
Reviewed-by: Oza Pawandeep <[email protected]>
---
drivers/pci/controller/pcie-iproc.c | 51 ++++++++++++++++++++++++++++++++++++-
1 file changed, 50 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/controller/pcie-iproc.c b/drivers/pci/controller/pcie-iproc.c
index c20fd6b..b7bef1c 100644
--- a/drivers/pci/controller/pcie-iproc.c
+++ b/drivers/pci/controller/pcie-iproc.c
@@ -1146,25 +1146,74 @@ static int iproc_pcie_setup_ib(struct iproc_pcie *pcie,
return ret;
}

+static int
+iproc_pcie_add_dma_resv_range(struct device *dev, struct list_head *resources,
+ uint64_t start, uint64_t end)
+{
+ struct resource *res;
+
+ res = devm_kzalloc(dev, sizeof(struct resource), GFP_KERNEL);
+ if (!res)
+ return -ENOMEM;
+
+ res->start = (resource_size_t)start;
+ res->end = (resource_size_t)end;
+ pci_add_resource_offset(resources, res, 0);
+
+ return 0;
+}
+
static int iproc_pcie_map_dma_ranges(struct iproc_pcie *pcie)
{
+ struct pci_host_bridge *host = pci_host_bridge_from_priv(pcie);
struct of_pci_range range;
struct of_pci_range_parser parser;
int ret;
+ uint64_t start, end;
+ LIST_HEAD(resources);

/* Get the dma-ranges from DT */
ret = of_pci_dma_range_parser_init(&parser, pcie->dev->of_node);
if (ret)
return ret;

+ start = 0;
for_each_of_pci_range(&parser, &range) {
+ end = range.pci_addr;
+ /* dma-ranges list expected in sorted order */
+ if (end < start) {
+ ret = -EINVAL;
+ goto out;
+ }
/* Each range entry corresponds to an inbound mapping region */
ret = iproc_pcie_setup_ib(pcie, &range, IPROC_PCIE_IB_MAP_MEM);
if (ret)
- return ret;
+ goto out;
+
+ if (end - start) {
+ ret = iproc_pcie_add_dma_resv_range(pcie->dev,
+ &resources,
+ start, end);
+ if (ret)
+ goto out;
+ }
+ start = range.pci_addr + range.size;
}

+ end = DMA_BIT_MASK(sizeof(dma_addr_t) * BITS_PER_BYTE);
+ if (end - start) {
+ ret = iproc_pcie_add_dma_resv_range(pcie->dev, &resources,
+ start, end);
+ if (ret)
+ goto out;
+ }
+
+ list_splice_init(&resources, &host->dma_resv);
+
return 0;
+out:
+ pci_free_resource_list(&resources);
+ return ret;
}

static int iproce_pcie_get_msi(struct iproc_pcie *pcie,
--
2.7.4


2019-02-21 09:00:37

by Srinath Mannam

Subject: Re: [PATCH v3 0/3] PCIe Host request to reserve IOVA

Hi Bjorn,

Please help to review this patch series.
Thank you.

Regards,
Srinath.
On Fri, Jan 25, 2019 at 3:44 PM Srinath Mannam
<[email protected]> wrote:
>
> Few SOCs have limitation that their PCIe host can't allow few inbound
> address ranges. Allowed inbound address ranges are listed in dma-ranges
> DT property and this address ranges are required to do IOVA mapping.
> Remaining address ranges have to be reserved in IOVA mapping.
>
> PCIe Host driver of those SOCs has to list all address ranges which have
> to reserve their IOVA address into PCIe host bridge resource entry list.
> IOMMU framework will reserve these IOVAs while initializing IOMMU domain.
>
> This patch set is based on Linux-5.0-rc2.
>
> Changes from v2:
> - Patch set rebased to Linux-5.0-rc2
>
> Changes from v1:
> - Addressed Oza review comments.
>
> Srinath Mannam (3):
> PCI: Add dma-resv window list
> iommu/dma: Reserve IOVA for PCI host reserve address list
> PCI: iproc: Add dma reserve resources to host
>
> drivers/iommu/dma-iommu.c | 8 ++++++
> drivers/pci/controller/pcie-iproc.c | 51 ++++++++++++++++++++++++++++++++++++-
> drivers/pci/probe.c | 3 +++
> include/linux/pci.h | 1 +
> 4 files changed, 62 insertions(+), 1 deletion(-)
>
> --
> 2.7.4
>

2019-03-27 08:45:25

by Srinath Mannam

Subject: Re: [PATCH v3 0/3] PCIe Host request to reserve IOVA

Hi Bjorn,

Could you please help to review the patch series below?

Thanks,
Srinath.

On Thu, Feb 21, 2019 at 2:29 PM Srinath Mannam
<[email protected]> wrote:
>
> Hi Bjorn,
>
> Please help to review this patch series.
> Thank you.
>
> Regards,
> Srinath.
> On Fri, Jan 25, 2019 at 3:44 PM Srinath Mannam
> <[email protected]> wrote:
> >
> > Few SOCs have limitation that their PCIe host can't allow few inbound
> > address ranges. Allowed inbound address ranges are listed in dma-ranges
> > DT property and this address ranges are required to do IOVA mapping.
> > Remaining address ranges have to be reserved in IOVA mapping.
> >
> > PCIe Host driver of those SOCs has to list all address ranges which have
> > to reserve their IOVA address into PCIe host bridge resource entry list.
> > IOMMU framework will reserve these IOVAs while initializing IOMMU domain.
> >
> > This patch set is based on Linux-5.0-rc2.
> >
> > Changes from v2:
> > - Patch set rebased to Linux-5.0-rc2
> >
> > Changes from v1:
> > - Addressed Oza review comments.
> >
> > Srinath Mannam (3):
> > PCI: Add dma-resv window list
> > iommu/dma: Reserve IOVA for PCI host reserve address list
> > PCI: iproc: Add dma reserve resources to host
> >
> > drivers/iommu/dma-iommu.c | 8 ++++++
> > drivers/pci/controller/pcie-iproc.c | 51 ++++++++++++++++++++++++++++++++++++-
> > drivers/pci/probe.c | 3 +++
> > include/linux/pci.h | 1 +
> > 4 files changed, 62 insertions(+), 1 deletion(-)
> >
> > --
> > 2.7.4
> >

2019-03-27 15:03:41

by Robin Murphy

Subject: Re: [PATCH v3 0/3] PCIe Host request to reserve IOVA

On 25/01/2019 10:13, Srinath Mannam wrote:
> Few SOCs have limitation that their PCIe host can't allow few inbound
> address ranges. Allowed inbound address ranges are listed in dma-ranges
> DT property and this address ranges are required to do IOVA mapping.
> Remaining address ranges have to be reserved in IOVA mapping.
>
> PCIe Host driver of those SOCs has to list all address ranges which have
> to reserve their IOVA address into PCIe host bridge resource entry list.
> IOMMU framework will reserve these IOVAs while initializing IOMMU domain.

FWIW I'm still only interested in solving this problem generically,
because in principle it's not specific to PCI, for PCI it's certainly
not specific to iproc, and either way it's not specific to DT. That
said, I don't care strongly enough to keep pushing back on this
implementation outright, since it's not something which couldn't be
cleaned up 'properly' in future.

One general comment I'd make, though, is that AFAIK PCI has a concept of
inbound windows much more than it has a concept of gaps-between-windows,
so if the PCI layer is going to track anything it should probably be the
actual windows, and leave the DMA layer to invert them into the
reservations it cares about as it consumes the list. That way you can
also avoid the undocumented requirement for the firmware to keep the
ranges property sorted in the first place.

Robin.

>
> This patch set is based on Linux-5.0-rc2.
>
> Changes from v2:
> - Patch set rebased to Linux-5.0-rc2
>
> Changes from v1:
> - Addressed Oza review comments.
>
> Srinath Mannam (3):
> PCI: Add dma-resv window list
> iommu/dma: Reserve IOVA for PCI host reserve address list
> PCI: iproc: Add dma reserve resources to host
>
> drivers/iommu/dma-iommu.c | 8 ++++++
> drivers/pci/controller/pcie-iproc.c | 51 ++++++++++++++++++++++++++++++++++++-
> drivers/pci/probe.c | 3 +++
> include/linux/pci.h | 1 +
> 4 files changed, 62 insertions(+), 1 deletion(-)
>

2019-03-28 10:35:18

by Srinath Mannam

Subject: Re: [PATCH v3 0/3] PCIe Host request to reserve IOVA

Hi Robin,

Thanks for your feedback. Please see my reply in line.

On Wed, Mar 27, 2019 at 8:32 PM Robin Murphy <[email protected]> wrote:
>
> On 25/01/2019 10:13, Srinath Mannam wrote:
> > Few SOCs have limitation that their PCIe host can't allow few inbound
> > address ranges. Allowed inbound address ranges are listed in dma-ranges
> > DT property and this address ranges are required to do IOVA mapping.
> > Remaining address ranges have to be reserved in IOVA mapping.
> >
> > PCIe Host driver of those SOCs has to list all address ranges which have
> > to reserve their IOVA address into PCIe host bridge resource entry list.
> > IOMMU framework will reserve these IOVAs while initializing IOMMU domain.
>
> FWIW I'm still only interested in solving this problem generically,
> because in principle it's not specific to PCI, for PCI it's certainly
> not specific to iproc, and either way it's not specific to DT. That
> said, I don't care strongly enough to keep pushing back on this
> implementation outright, since it's not something which couldn't be
> cleaned up 'properly' in future.
The iProc PCIe host controller supports an inbound address translation
feature to restrict access to the allowed address ranges, so the
allowed memory ranges need to be programmed into the controller. The
allowed address range information is passed to the controller driver
through the dma-ranges DT property. This feature is specific to the
iProc PCIe controller, so I think this change has to be specific to
the iProc PCIe driver and DT. Here I followed the same way PCI I/O
regions are reserved ("iova_reserve_pci_windows"), so this change is
also specific to PCI.
>
> One general comment I'd make, though, is that AFAIK PCI has a concept of
> inbound windows much more than it has a concept of gaps-between-windows,
> so if the PCI layer is going to track anything it should probably be the
> actual windows, and leave the DMA layer to invert them into the
> reservations it cares about as it consumes the list. That way you can
> also avoid the undocumented requirement for the firmware to keep the
> ranges property sorted in the first place.
This implementation has three parts:
1. Parse dma-ranges and extract the allowed and reserved address ranges.
2. Program the allowed ranges into the iProc PCIe controller.
3. Reserve the list of reserved address ranges in the IOMMU layer.
#1 and #2 are done using "of_pci_dma_range_parser_init" in the present
iProc PCIe driver, so I build the list of reserve windows in the same
place. #3 requires the list of reserve windows, so I added a new
field (dma_resv) to carry this list from the iProc driver layer to the
IOMMU layer.
The reasons not to parse dma-ranges in the DMA layer are:
1. This feature is not generic to all SoCs.
2. To avoid parsing dma-ranges in multiple places; it is already done
in the iProc PCIe driver.
3. It would require modifying standard DMA layer code
("of_dma_configure").
4. A carrier would be required to pass the reserved windows list from
the DMA layer to the IOMMU layer.
5. I followed the existing procedure for reserving PCIe I/O regions in
the IOMMU layer.

Regards,
Srinath.
>
> Robin.
>
> >
> > This patch set is based on Linux-5.0-rc2.
> >
> > Changes from v2:
> > - Patch set rebased to Linux-5.0-rc2
> >
> > Changes from v1:
> > - Addressed Oza review comments.
> >
> > Srinath Mannam (3):
> > PCI: Add dma-resv window list
> > iommu/dma: Reserve IOVA for PCI host reserve address list
> > PCI: iproc: Add dma reserve resources to host
> >
> > drivers/iommu/dma-iommu.c | 8 ++++++
> > drivers/pci/controller/pcie-iproc.c | 51 ++++++++++++++++++++++++++++++++++++-
> > drivers/pci/probe.c | 3 +++
> > include/linux/pci.h | 1 +
> > 4 files changed, 62 insertions(+), 1 deletion(-)
> >

2019-03-28 15:48:34

by Robin Murphy

Subject: Re: [PATCH v3 0/3] PCIe Host request to reserve IOVA

On 28/03/2019 10:34, Srinath Mannam wrote:
> Hi Robin,
>
> Thanks for your feedback. Please see my reply in line.
>
> On Wed, Mar 27, 2019 at 8:32 PM Robin Murphy <[email protected]> wrote:
>>
>> On 25/01/2019 10:13, Srinath Mannam wrote:
>>> Few SOCs have limitation that their PCIe host can't allow few inbound
>>> address ranges. Allowed inbound address ranges are listed in dma-ranges
>>> DT property and this address ranges are required to do IOVA mapping.
>>> Remaining address ranges have to be reserved in IOVA mapping.
>>>
>>> PCIe Host driver of those SOCs has to list all address ranges which have
>>> to reserve their IOVA address into PCIe host bridge resource entry list.
>>> IOMMU framework will reserve these IOVAs while initializing IOMMU domain.
>>
>> FWIW I'm still only interested in solving this problem generically,
>> because in principle it's not specific to PCI, for PCI it's certainly
>> not specific to iproc, and either way it's not specific to DT. That
>> said, I don't care strongly enough to keep pushing back on this
>> implementation outright, since it's not something which couldn't be
>> cleaned up 'properly' in future.
> Iproc PCIe host controller supports inbound address translation
> feature to restrict access
> to allowed address ranges. so that allowed memory ranges need to
> program to controller.

Other PCIe host controllers work that way too - I know, because I've got
one here. In this particular case, it's not explicit "restriction" so
much as just that the window configuration controls what AXI attributes
are generated on the master side of the PCIe-AXI bridge, and there is no
default attribute. Thus if a PCIe transaction doesn't hit one of the
windows it simply cannot propagate across to the AXI side because the RC
won't know what attributes to emit. It may be conceptually a very
slightly different problem statement, but it still wants the exact same
solution.

> allowed address ranges information is passed to controller driver
> through dma-ranges DT property.

And ACPI has a direct equivalent of dma-ranges in the form of the _DMA
method - compare of_dma_get_range() and acpi_dma_get_range(). Again,
platforms already exist which have this kind of hardware limitation and
boot with both DT and ACPI.

> This feature is specific to iproc PCIe controller, so that I think
> this change has to specific to iproc
> PCIe driver and DT.

The general concept of devices having inaccessible holes within their
nominal DMA mask ultimately boils down to how creative SoC designers can
be with interconnect topologies, so in principle it could end up being
relevant just about anywhere. But as I implied before, since the
examples we know about today all seem to be PCIe IPs, it's not all that
unreasonable to start with this PCI-specific workaround now, and
generalise it later as necessary.

> Here I followed the same way how PCI IO regions are reserved
> "iova_reserve_pci_windows". so that this
> change also specific to PCI.
>>
>> One general comment I'd make, though, is that AFAIK PCI has a concept of
>> inbound windows much more than it has a concept of gaps-between-windows,
>> so if the PCI layer is going to track anything it should probably be the
>> actual windows, and leave the DMA layer to invert them into the
>> reservations it cares about as it consumes the list. That way you can
>> also avoid the undocumented requirement for the firmware to keep the
>> ranges property sorted in the first place.
> This implementation has three parts.
> 1. parsing dma-ranges and extract allowed and reserved address ranges.
> 2. program allowed ranges to iproc PCIe controller.
> 3. reserve list of reserved address ranges in IOMMU layer.
> #1 and #2 are done using "of_pci_dma_range_parser_init" in present
> iproc PCIe driver.
> so that, I listed reserve windows at the same place.
> #3 requires list of reserve windows so that I add new
> variable(dma_resv) to carry these
> reserve windows list to iommu layer from iproc driver layer.
> The reasons to not use DMA layer for parsing dma-ranges are,
> 1. This feature is not generic for all SOCs.
> 2. To avoid dam-ranges parsing in multiple places, already done in
> iproc pcie driver.
> 3. Need to do modify standard DMA layer source code "of_dma_configure"
> 4. required a carrier to pass reserved windows list from DMA layer to
> IOMMU layer.
> 5. I followed existing PCIe IO regions reserve procedure done in IOMMU layer.

Sure, I get that - sorry if it was unclear, but all I meant was simply
taking the flow you currently have, i.e.:

pcie-iproc: parse dma-ranges and make list of gaps between regions
dma-iommu: process list and reserve entries

and tweaking it into this:

pcie-iproc: parse dma-ranges and make list of regions
dma-iommu: process list and reserve gaps between entries

which has the nice benefit of being more robust since the first step can
easily construct the list in correctly-sorted order regardless of the
order in which the DT ranges appear.

Robin.

2019-03-29 13:22:53

by Srinath Mannam

Subject: Re: [PATCH v3 0/3] PCIe Host request to reserve IOVA

Hi Robin,

Thanks a lot for the detailed clarification.
I will send the next patch set with the changes you suggested.

Regards,
Srinath.

On Thu, Mar 28, 2019 at 9:17 PM Robin Murphy <[email protected]> wrote:
>
> On 28/03/2019 10:34, Srinath Mannam wrote:
> > Hi Robin,
> >
> > Thanks for your feedback. Please see my reply in line.
> >
> > On Wed, Mar 27, 2019 at 8:32 PM Robin Murphy <[email protected]> wrote:
> >>
> >> On 25/01/2019 10:13, Srinath Mannam wrote:
> >>> Few SOCs have limitation that their PCIe host can't allow few inbound
> >>> address ranges. Allowed inbound address ranges are listed in dma-ranges
> >>> DT property and this address ranges are required to do IOVA mapping.
> >>> Remaining address ranges have to be reserved in IOVA mapping.
> >>>
> >>> PCIe Host driver of those SOCs has to list all address ranges which have
> >>> to reserve their IOVA address into PCIe host bridge resource entry list.
> >>> IOMMU framework will reserve these IOVAs while initializing IOMMU domain.
> >>
> >> FWIW I'm still only interested in solving this problem generically,
> >> because in principle it's not specific to PCI, for PCI it's certainly
> >> not specific to iproc, and either way it's not specific to DT. That
> >> said, I don't care strongly enough to keep pushing back on this
> >> implementation outright, since it's not something which couldn't be
> >> cleaned up 'properly' in future.
> > Iproc PCIe host controller supports inbound address translation
> > feature to restrict access
> > to allowed address ranges. so that allowed memory ranges need to
> > program to controller.
>
> Other PCIe host controllers work that way too - I know, because I've got
> one here. In this particular case, it's not explicit "restriction" so
> much as just that the window configuration controls what AXI attributes
> are generated on the master side of the PCIe-AXI bridge, and there is no
> default attribute. Thus if a PCIe transaction doesn't hit one of the
> windows it simply cannot propagate across to the AXI side because the RC
> won't know what attributes to emit. It may be conceptually a very
> slightly different problem statement, but it still wants the exact same
> solution.
>
> > allowed address ranges information is passed to controller driver
> > through dma-ranges DT property.
>
> And ACPI has a direct equivalent of dma-ranges in the form of the _DMA
> method - compare of_dma_get_range() and acpi_dma_get_range(). Again,
> platforms already exist which have this kind of hardware limitation and
> boot with both DT and ACPI.
>
> > This feature is specific to iproc PCIe controller, so that I think
> > this change has to specific to iproc
> > PCIe driver and DT.
>
> The general concept of devices having inaccessible holes within their
> nominal DMA mask ultimately boils down to how creative SoC designers can
> be with interconnect topologies, so in principle it could end up being
> relevant just about anywhere. But as I implied before, since the
> examples we know about today all seem to be PCIe IPs, it's not all that
> unreasonable to start with this PCI-specific workaround now, and
> generalise it later as necessary.
>
> > Here I followed the same way how PCI IO regions are reserved
> > "iova_reserve_pci_windows". so that this
> > change also specific to PCI.
> >>
> >> One general comment I'd make, though, is that AFAIK PCI has a concept of
> >> inbound windows much more than it has a concept of gaps-between-windows,
> >> so if the PCI layer is going to track anything it should probably be the
> >> actual windows, and leave the DMA layer to invert them into the
> >> reservations it cares about as it consumes the list. That way you can
> >> also avoid the undocumented requirement for the firmware to keep the
> >> ranges property sorted in the first place.
> > This implementation has three parts.
> > 1. parsing dma-ranges and extract allowed and reserved address ranges.
> > 2. program allowed ranges to iproc PCIe controller.
> > 3. reserve list of reserved address ranges in IOMMU layer.
> > #1 and #2 are done using "of_pci_dma_range_parser_init" in present
> > iproc PCIe driver.
> > so that, I listed reserve windows at the same place.
> > #3 requires list of reserve windows so that I add new
> > variable(dma_resv) to carry these
> > reserve windows list to iommu layer from iproc driver layer.
> > The reasons to not use DMA layer for parsing dma-ranges are,
> > 1. This feature is not generic for all SOCs.
> > 2. To avoid dam-ranges parsing in multiple places, already done in
> > iproc pcie driver.
> > 3. Need to do modify standard DMA layer source code "of_dma_configure"
> > 4. required a carrier to pass reserved windows list from DMA layer to
> > IOMMU layer.
> > 5. I followed existing PCIe IO regions reserve procedure done in IOMMU layer.
>
> Sure, I get that - sorry if it was unclear, but all I meant was simply
> taking the flow you currently have, i.e.:
>
> pcie-iproc: parse dma-ranges and make list of gaps between regions
> dma-iommu: process list and reserve entries
>
> and tweaking it into this:
>
> pcie-iproc: parse dma-ranges and make list of regions
> dma-iommu: process list and reserve gaps between entries
>
> which has the nice benefit of being more robust since the first step can
> easily construct the list in correctly-sorted order regardless of the
> order in which the DT ranges appear.
>
> Robin.