2004-10-14 12:48:04

by Matthew Wilcox

Subject: [PATCH] Introduce PCI <-> CPU address conversion [1/2]


Some machines have a different address space on the PCI bus from the
CPU's bus. This is currently fixed up in pcibios_fixup_bus(). However,
this is not called for hotplug devices. Calling pcibios_fixup_bus() when
a device is hotplugged onto a bus is also wrong as it would attempt to
fixup devices that have already been fixed up with potentially horrific
consequences.

This patch teaches the generic PCI layer that there may be different
address spaces, and converts from bus views to cpu views when reading
from BARs. Some drivers (eg sym2, acpiphp) need to go back the other
way, so it also introduces the inverse operation.

Architectures can migrate to the new pci_phys_to_bus() / pci_bus_to_phys()
interface in their own time; I shall post the patch for ia64 separately.
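
As a rough illustration of what an architecture with distinct address
spaces might put behind these two hooks, here is a minimal user-space
sketch that assumes the two views differ by a fixed per-window offset.
The struct example_window, the EXAMPLE_RES_* flag values and the
example_* function names are all hypothetical; they are not part of this
patch or of any architecture's code.

```c
/* Hypothetical description of one host bridge window (not a kernel type). */
struct example_window {
	unsigned long mem_offset;	/* CPU address minus bus address, memory space */
	unsigned long io_offset;	/* CPU address minus bus address, I/O space */
};

/* Stand-ins for the IORESOURCE_IO / IORESOURCE_MEM resource flags. */
#define EXAMPLE_RES_IO	0x100UL
#define EXAMPLE_RES_MEM	0x200UL

/* Pick the offset that applies to the address space named by flags. */
static unsigned long
example_offset(const struct example_window *w, unsigned long flags)
{
	return (flags & EXAMPLE_RES_IO) ? w->io_offset : w->mem_offset;
}

/* Bus view -> CPU view, as would be needed when reading a BAR. */
unsigned long
example_bus_to_phys(const struct example_window *w, unsigned long addr,
		    unsigned long flags)
{
	return addr + example_offset(w, flags);
}

/* CPU view -> bus view, the inverse needed by some drivers. */
unsigned long
example_phys_to_bus(const struct example_window *w, unsigned long addr,
		    unsigned long flags)
{
	return addr - example_offset(w, flags);
}
```

With both offsets set to zero this degenerates to the no-op default that
the patch provides for architectures which share one address space.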

drivers/pci/probe.c | 99 +++++++++++++++++++++++++++------------------------
drivers/pci/quirks.c | 4 +-
include/linux/pci.h | 11 +++++
3 files changed, 67 insertions(+), 47 deletions(-)

Index: pci-2.6/drivers/pci/probe.c
===================================================================
RCS file: /var/cvs/linux-2.6/drivers/pci/probe.c,v
retrieving revision 1.14
diff -u -p -r1.14 probe.c
--- pci-2.6/drivers/pci/probe.c 13 Sep 2004 15:23:21 -0000 1.14
+++ pci-2.6/drivers/pci/probe.c 14 Oct 2004 12:04:02 -0000
@@ -82,9 +82,10 @@ static inline unsigned int pci_calc_reso
/*
* Find the extent of a PCI decode..
*/
-static u32 pci_size(u32 base, u32 maxbase, unsigned long mask)
+static unsigned long
+pci_size(unsigned long base, unsigned long maxbase, unsigned long mask)
{
- u32 size = mask & maxbase; /* Find the significant bits */
+ unsigned long size = mask & maxbase; /* Find the significant bits */
if (!size)
return 0;

@@ -100,14 +101,18 @@ static u32 pci_size(u32 base, u32 maxbas
return size;
}

+#define IS_MEMORY(l) (((l) & PCI_BASE_ADDRESS_SPACE) == \
+ PCI_BASE_ADDRESS_SPACE_MEMORY)
+#define IS_64BIT(l) (((l) & PCI_BASE_ADDRESS_MEM_TYPE_64) != 0)
+
static void pci_read_bases(struct pci_dev *dev, unsigned int howmany, int rom)
{
- unsigned int pos, reg, next;
+ unsigned int pos, reg;
u32 l, sz;
struct resource *res;

- for(pos=0; pos<howmany; pos = next) {
- next = pos+1;
+ for (pos = 0; pos < howmany; pos++) {
+ unsigned long start, size;
res = &dev->resource[pos];
res->name = pci_name(dev);
reg = PCI_BASE_ADDRESS_0 + (pos << 2);
@@ -119,43 +124,45 @@ static void pci_read_bases(struct pci_de
continue;
if (l == 0xffffffff)
l = 0;
- if ((l & PCI_BASE_ADDRESS_SPACE) == PCI_BASE_ADDRESS_SPACE_MEMORY) {
- sz = pci_size(l, sz, PCI_BASE_ADDRESS_MEM_MASK);
- if (!sz)
- continue;
- res->start = l & PCI_BASE_ADDRESS_MEM_MASK;
- res->flags |= l & ~PCI_BASE_ADDRESS_MEM_MASK;
- } else {
- sz = pci_size(l, sz, PCI_BASE_ADDRESS_IO_MASK & 0xffff);
- if (!sz)
- continue;
- res->start = l & PCI_BASE_ADDRESS_IO_MASK;
- res->flags |= l & ~PCI_BASE_ADDRESS_IO_MASK;
- }
- res->end = res->start + (unsigned long) sz;
- res->flags |= pci_calc_resource_flags(l);
- if ((l & (PCI_BASE_ADDRESS_SPACE | PCI_BASE_ADDRESS_MEM_TYPE_MASK))
- == (PCI_BASE_ADDRESS_SPACE_MEMORY | PCI_BASE_ADDRESS_MEM_TYPE_64)) {
- pci_read_config_dword(dev, reg+4, &l);
- next++;
+ if (IS_MEMORY(l)) {
+ size = sz;
+ start = l;
+ if (IS_64BIT(l)) {
+ pos++;
+ pci_read_config_dword(dev, reg+4, &l);
#if BITS_PER_LONG == 64
- res->start |= ((unsigned long) l) << 32;
- res->end = res->start + sz;
- pci_write_config_dword(dev, reg+4, ~0);
- pci_read_config_dword(dev, reg+4, &sz);
- pci_write_config_dword(dev, reg+4, l);
- if (~sz)
- res->end = res->start + 0xffffffff +
- (((unsigned long) ~sz) << 32);
+ start |= ((unsigned long) l) << 32;
+ pci_write_config_dword(dev, reg+4, ~0);
+ pci_read_config_dword(dev, reg+4, &sz);
+ pci_write_config_dword(dev, reg+4, l);
+ size |= ((unsigned long) sz) << 32;
#else
- if (l) {
- printk(KERN_ERR "PCI: Unable to handle 64-bit address for device %s\n", pci_name(dev));
- res->start = 0;
- res->flags = 0;
- continue;
- }
+ if (l) {
+ printk(KERN_ERR "PCI: Unable to handle "
+ "64-bit address for "
+ "device %s\n",
+ pci_name(dev));
+ res->flags = 0;
+ continue;
+ }
#endif
+ }
+
+ size = pci_size(start, size, PCI_BASE_ADDRESS_MEM_MASK);
+ if (!size)
+ continue;
+ res->flags = start & ~PCI_BASE_ADDRESS_MEM_MASK;
+ start = start & PCI_BASE_ADDRESS_MEM_MASK;
+ } else {
+ size = pci_size(l, sz, PCI_BASE_ADDRESS_IO_MASK & 0xffff);
+ if (!size)
+ continue;
+ start = l & PCI_BASE_ADDRESS_IO_MASK;
+ res->flags = l & ~PCI_BASE_ADDRESS_IO_MASK;
}
+ res->flags |= pci_calc_resource_flags(res->flags);
+ res->start = pci_bus_to_phys(dev, start, res->flags);
+ res->end = res->start + size;
}
if (rom) {
dev->rom_base_reg = rom;
@@ -173,7 +180,9 @@ static void pci_read_bases(struct pci_de
res->flags = (l & PCI_ROM_ADDRESS_ENABLE) |
IORESOURCE_MEM | IORESOURCE_PREFETCH |
IORESOURCE_READONLY | IORESOURCE_CACHEABLE;
- res->start = l & PCI_ROM_ADDRESS_MASK;
+ res->start = pci_bus_to_phys(dev,
+ l & PCI_ROM_ADDRESS_MASK,
+ res->flags);
res->end = res->start + (unsigned long) sz;
}
}
@@ -218,8 +227,8 @@ void __devinit pci_read_bridge_bases(str

if (base <= limit) {
res->flags = (io_base_lo & PCI_IO_RANGE_TYPE_MASK) | IORESOURCE_IO;
- res->start = base;
- res->end = limit + 0xfff;
+ res->start = pci_bus_to_phys(dev, base, IORESOURCE_IO);
+ res->end = pci_bus_to_phys(dev, limit + 0xfff, IORESOURCE_IO);
}

res = child->resource[1];
@@ -229,8 +238,8 @@ void __devinit pci_read_bridge_bases(str
limit = (mem_limit_lo & PCI_MEMORY_RANGE_MASK) << 16;
if (base <= limit) {
res->flags = (mem_base_lo & PCI_MEMORY_RANGE_TYPE_MASK) | IORESOURCE_MEM;
- res->start = base;
- res->end = limit + 0xfffff;
+ res->start = pci_bus_to_phys(dev, base, IORESOURCE_MEM);
+ res->end = pci_bus_to_phys(dev, limit + 0xfffff, IORESOURCE_MEM);
}

res = child->resource[2];
@@ -255,8 +264,8 @@ void __devinit pci_read_bridge_bases(str
}
if (base <= limit) {
res->flags = (mem_base_lo & PCI_MEMORY_RANGE_TYPE_MASK) | IORESOURCE_MEM | IORESOURCE_PREFETCH;
- res->start = base;
- res->end = limit + 0xfffff;
+ res->start = pci_bus_to_phys(dev, base, IORESOURCE_MEM);
+ res->end = pci_bus_to_phys(dev, limit + 0xfffff, IORESOURCE_MEM);
}
}

Index: pci-2.6/drivers/pci/quirks.c
===================================================================
RCS file: /var/cvs/linux-2.6/drivers/pci/quirks.c,v
retrieving revision 1.16
diff -u -p -r1.16 quirks.c
--- pci-2.6/drivers/pci/quirks.c 13 Sep 2004 15:23:21 -0000 1.16
+++ pci-2.6/drivers/pci/quirks.c 14 Oct 2004 12:04:02 -0000
@@ -236,8 +236,8 @@ static void __devinit quirk_io_region(st
struct resource *res = dev->resource + nr;

res->name = pci_name(dev);
- res->start = region;
- res->end = region + size - 1;
+ res->start = pci_bus_to_phys(dev, region, IORESOURCE_IO);
+ res->end = res->start + size - 1;
res->flags = IORESOURCE_IO;
pci_claim_resource(dev, nr);
}
Index: pci-2.6/include/linux/pci.h
===================================================================
RCS file: /var/cvs/linux-2.6/include/linux/pci.h,v
retrieving revision 1.18
diff -u -p -r1.18 pci.h
--- pci-2.6/include/linux/pci.h 13 Sep 2004 15:24:12 -0000 1.18
+++ pci-2.6/include/linux/pci.h 14 Oct 2004 12:04:07 -0000
@@ -994,6 +994,17 @@ static inline char *pci_name(struct pci_
#endif

/*
+ * Convert between the CPU's view of addresses on a PCI card and the PCI
+ * device's view of the same location. The default implementation is a no-op
+ * as most architectures have the same addresses on the CPU and PCI busses.
+ */
+
+#ifndef pci_phys_to_bus
+#define pci_phys_to_bus(busdev, addr, flags) (addr)
+#define pci_bus_to_phys(busdev, addr, flags) (addr)
+#endif
+
+/*
* The world is not perfect and supplies us with broken PCI devices.
* For at least a part of these bugs we need a work-around, so both
* generic (drivers/pci/quirks.c) and per-architecture code can define
--
"Next the statesmen will invent cheap lies, putting the blame upon
the nation that is attacked, and every man will be glad of those
conscience-soothing falsities, and will diligently study them, and refuse
to examine any refutations of them; and thus he will by and by convince
himself that the war is just, and will thank God for the better sleep
he enjoys after this process of grotesque self-deception." -- Mark Twain


2004-10-14 12:53:56

by Christoph Hellwig

Subject: Re: [PATCH] Introduce PCI <-> CPU address conversion [1/2]

> +#define IS_MEMORY(l) (((l) & PCI_BASE_ADDRESS_SPACE) == \
> + PCI_BASE_ADDRESS_SPACE_MEMORY)
> +#define IS_64BIT(l) (((l) & PCI_BASE_ADDRESS_MEM_TYPE_64) != 0)

These should go to pci.h with more descriptive names.

> /*
> + * Convert between the CPU's view of addresses on a PCI card and the PCI
> + * device's view of the same location. The default implementation is a no-op
> + * as most architectures have the same addresses on the CPU and PCI busses.
> + */
> +
> +#ifndef pci_phys_to_bus
> +#define pci_phys_to_bus(busdev, addr, flags) (addr)
> +#define pci_bus_to_phys(busdev, addr, flags) (addr)
> +#endif

I'd rather have this declared in every architecture's asm/ header, so it's
more explicit that it's a per-arch thing. Also make it a static inline
so we get typechecking.
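
A static inline default along the lines suggested here might look
roughly like the following. This is only a sketch: struct
example_pci_dev stands in for struct pci_dev, and the example_ names are
made up for illustration.

```c
/* Stand-in for struct pci_dev; hypothetical, for illustration only. */
struct example_pci_dev {
	unsigned int devfn;
};

/*
 * Identity conversions for an architecture where the CPU and the PCI bus
 * share one address space. Unlike the macro version, the compiler checks
 * that the first argument really is a struct example_pci_dev pointer.
 */
static inline unsigned long
example_pci_bus_to_phys(const struct example_pci_dev *dev,
			unsigned long addr, unsigned long flags)
{
	(void)dev;	/* unused in the no-op default */
	(void)flags;	/* unused in the no-op default */
	return addr;
}

static inline unsigned long
example_pci_phys_to_bus(const struct example_pci_dev *dev,
			unsigned long addr, unsigned long flags)
{
	(void)dev;
	(void)flags;
	return addr;
}
```

Passing a pointer to some other structure type, such as a bus, would
then draw a compiler diagnostic instead of being silently accepted.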

2004-10-14 13:53:28

by Matthew Wilcox

Subject: Re: [PATCH] Introduce PCI <-> CPU address conversion [1/2]

On Thu, Oct 14, 2004 at 01:53:48PM +0100, Christoph Hellwig wrote:
> I'd rather have this declared in every architecture's asm/ header, so it's
> more explicit that it's a per-arch thing. Also make it a static inline
> so we get typechecking.

I actually don't want typechecking. Sometimes you have a device and
sometimes you have a bus. You can get everything you want (sysdata)
from either, so there's no point in doing dev->bus when all you needed
was right there.

2004-10-14 14:32:54

by Ivan Kokshaysky

Subject: Re: [PATCH] Introduce PCI <-> CPU address conversion [1/2]

On Thu, Oct 14, 2004 at 01:47:37PM +0100, Matthew Wilcox wrote:
> Some machines have a different address space on the PCI bus from the
> CPU's bus. This is currently fixed up in pcibios_fixup_bus(). However,
> this is not called for hotplug devices. Calling pcibios_fixup_bus() when
> a device is hotplugged onto a bus is also wrong as it would attempt to
> fixup devices that have already been fixed up with potentially horrific
> consequences.

This logic makes sense only if you have some sort of firmware which
properly initializes the hotplug devices, so I think that the fixup
should belong in that particular hotplug driver (or architecture).
In the general case the newly inserted device will have 0s in the BARs,
so there is no point in a bus-to-CPU conversion. You have to use
pci_assign_resource(), which does know about different address
spaces.

> This patch teaches the generic PCI layer that there may be different
> address spaces, and converts from bus views to cpu views when reading
> from BARs. Some drivers (eg sym2, acpiphp) need to go back the other
> way, so it also introduces the inverse operation.

This one already exists - pcibios_resource_to_bus().

Ivan.

2004-10-14 14:42:08

by Matthew Wilcox

Subject: Re: [PATCH] Introduce PCI <-> CPU address conversion [1/2]

On Thu, Oct 14, 2004 at 06:27:04PM +0400, Ivan Kokshaysky wrote:
> On Thu, Oct 14, 2004 at 01:47:37PM +0100, Matthew Wilcox wrote:
> > Some machines have a different address space on the PCI bus from the
> > CPU's bus. This is currently fixed up in pcibios_fixup_bus(). However,
> > this is not called for hotplug devices. Calling pcibios_fixup_bus() when
> > a device is hotplugged onto a bus is also wrong as it would attempt to
> > fixup devices that have already been fixed up with potentially horrific
> > consequences.
>
> This logic makes sense only if you have some sort of firmware which
> properly initializes the hotplug devices

Yes, this is the case for the ACPI hotplug driver, for example.

> so I think that the fixup
> should belong in that particular hotplug driver (or architecture).

*sigh*. Greg rejected a patch that did that. He wanted it fixed
more generally.

> > This patch teaches the generic PCI layer that there may be different
> > address spaces, and converts from bus views to cpu views when reading
> > from BARs. Some drivers (eg sym2, acpiphp) need to go back the other
> > way, so it also introduces the inverse operation.
>
> This one already exists - pcibios_resource_to_bus().

I can't use it in the symbios driver because it only exists on alpha,
arm, mips, parisc, ppc, ppc64, sparc64 and v850. It doesn't exist on
i386, ia64, x86_64, arm26, cris, h8300, m32r, sh, sh64 or sparc.

Perhaps from your point of view this patch makes more sense as a cleanup.
Rather than having all the duplicate code in all the architecture
pcibios_fixup_device and pcibios_resource_to_bus, they can implement
pci_bus_to_phys and pci_phys_to_bus.

2004-10-14 18:34:56

by Christoph Hellwig

Subject: Re: [PATCH] Introduce PCI <-> CPU address conversion [1/2]

On Thu, Oct 14, 2004 at 02:53:23PM +0100, Matthew Wilcox wrote:
> On Thu, Oct 14, 2004 at 01:53:48PM +0100, Christoph Hellwig wrote:
> > I'd rather have this declared in every architecture's asm/ header, so it's
> > more explicit that it's a per-arch thing. Also make it a static inline
> > so we get typechecking.
>
> I actually don't want typechecking. Sometimes you have a device and
> sometimes you have a bus. You can get everything you want (sysdata)
> from either, so there's no point in doing dev->bus when all you needed
> was right there.

For some architectures the sysdata is different for bus vs device, so
yes, we do want strict typechecking.

2004-10-14 20:23:19

by Matthew Wilcox

Subject: Re: [PATCH] Introduce PCI <-> CPU address conversion [1/2]

On Thu, Oct 14, 2004 at 07:00:06PM +0100, Christoph Hellwig wrote:
> For some architectures the sysdata is different for bus vs device, so
> yes, we do want strict typechecking.

How interesting. I was under the impression that dev->sysdata was always
a copy of the bus's. If that's not guaranteed, then we're just going
to have to dereference the additional pointer and use the bus' sysdata.
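
In code, always going through the bus could look like the sketch below.
The example_* structures are simplified stand-ins for struct pci_dev,
struct pci_bus and an architecture's controller data, and the mem_offset
field is an assumption made purely for illustration.

```c
/* Hypothetical per-controller data hung off the bus's sysdata. */
struct example_controller {
	unsigned long mem_offset;	/* CPU address minus bus address */
};

/* Simplified stand-ins for struct pci_bus and struct pci_dev. */
struct example_bus {
	void *sysdata;
};

struct example_dev {
	struct example_bus *bus;
	void *sysdata;			/* may differ from bus->sysdata */
};

/*
 * Convert a bus address to a CPU address for a device. The controller is
 * always looked up via dev->bus->sysdata, never via dev->sysdata, so the
 * result does not depend on the two being copies of each other.
 */
static inline unsigned long
example_dev_bus_to_phys(const struct example_dev *dev, unsigned long addr)
{
	const struct example_controller *c = dev->bus->sysdata;

	return addr + c->mem_offset;
}
```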

2004-10-14 22:55:57

by Colin Ngam

Subject: Re: [PATCH] Introduce PCI <-> CPU address conversion [1/2]

Matthew Wilcox wrote:

> On Thu, Oct 14, 2004 at 07:00:06PM +0100, Christoph Hellwig wrote:
> > For some architectures the sysdata is different for bus vs device, so
> > yes, we do want strict typechecking.
>
> How interesting. I was under the impression that dev->sysdata was always
> a copy of the bus's. If that's not guaranteed, then we're just going
> to have to dereference the additional pointer and use the bus' sysdata.

Hi Matthew,

On SGI's Altix system, the sysdata for the device is very much different than
the sysdata for the bus.

Thanks.

colin


2004-10-15 00:38:31

by Matthew Wilcox

Subject: Re: [PATCH] Introduce PCI <-> CPU address conversion [1/2]

On Thu, Oct 14, 2004 at 05:37:51PM -0500, Colin Ngam wrote:
> On SGI's Altix system, the sysdata for the device is very much different than
> the sysdata for the bus.

That's fascinating, because ia64 is one of the architectures that relies
on sysdata being the same in both the bus and the device:

#define PCI_CONTROLLER(busdev) ((struct pci_controller *) busdev->sysdata)

In various places, we have
struct pci_controller *controller = PCI_CONTROLLER(dev);
and
if (PCI_CONTROLLER(bus)->iommu)

So what the hell does Altix do? Which sysdata can be used to get to the
pci_controller? This seems like a horrible mistake to me.
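
The reason a macro of this shape accepts both types is that it only
requires its argument to have a sysdata member. The user-space sketch
below shows that property, and how it goes wrong if the device and bus
carry different sysdata; all example_* names are hypothetical stand-ins
rather than the real ia64 definitions.

```c
/* Hypothetical controller data; only the iommu field is modelled. */
struct example_controller {
	int iommu;
};

/* Simplified stand-ins for struct pci_bus and struct pci_dev. */
struct example_bus {
	void *sysdata;
};

struct example_dev {
	struct example_bus *bus;
	void *sysdata;
};

/* Works on anything with a ->sysdata member, bus or device alike. */
#define EXAMPLE_CONTROLLER(busdev) \
	((struct example_controller *) (busdev)->sysdata)

/* Nonzero when the device and its bus resolve to the same controller. */
static inline int example_sysdata_consistent(struct example_dev *dev)
{
	return EXAMPLE_CONTROLLER(dev) == EXAMPLE_CONTROLLER(dev->bus);
}
```

If the consistency check fails for a device, code that applies the macro
to the device would be casting unrelated data to a controller pointer.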

2004-10-15 07:26:50

by Richard Henderson

Subject: Re: [PATCH] Introduce PCI <-> CPU address conversion [1/2]

On Thu, Oct 14, 2004 at 03:39:24PM +0100, Matthew Wilcox wrote:
> I can't use it in the symbios driver because it only exists on alpha,
> arm, mips, parisc, ppc, ppc64, sparc64 and v850. It doesn't exist on
> i386, ia64, x86_64, arm26, cris, h8300, m32r, sh, sh64 or sparc.

So you conclude from 50% of the ports implementing things in a
particular way that you should invent a totally new interface?
Isn't the obvious solution to implement the existing interface
for the ports that don't have it?


r~

2004-10-15 10:35:00

by Ivan Kokshaysky

Subject: Re: [PATCH] Introduce PCI <-> CPU address conversion [1/2]

On Fri, Oct 15, 2004 at 12:19:26AM -0700, Richard Henderson wrote:
> So you conclude from 50% of the ports implementing things in a
> particular way that you should invent a totally new interface?
> Isn't the obvious solution to implement the existing interface
> for the ports that don't have it?

Definitely.
Besides, the name pci_bus_to_phys() is quite misleading. It sounds like an
invitation to use phys_to_virt() on the returned value...
pcibios_bus_to_resource() as the inverse of pcibios_resource_to_bus()
would be much cleaner.

Ivan.