2004-03-23 02:21:33

by Benjamin Herrenschmidt

Subject: Issues with /proc/bus/pci

Hi David !

I'm working on a userland code shell that helps soft-boot
video cards, and so for once had to use our nice /proc/bus/pci
API to mmap the IO and memory spaces.

Doing that, I figured out we have a problem. On PPC, we never
properly implemented the 'trick' that allows mapping the host bridge
itself (I thought it was there, but I was wrong), which
means we can't mmap space outside of the area represented by the
card's BARs.
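For readers unfamiliar with the interface, here is a minimal sketch of mapping PCI memory space through /proc/bus/pci. It assumes a Linux system that provides <linux/pci.h> and a caller with sufficient privileges; the helper names pci_proc_path() and pci_map_mem() are made up for illustration, and the path format assumes the bus directory carries no domain prefix.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/pci.h>   /* PCIIOC_MMAP_IS_MEM, PCIIOC_MMAP_IS_IO */

/* Hypothetical helper: build "/proc/bus/pci/BB/DD.F" from bus and devfn. */
static int pci_proc_path(char *buf, size_t len, unsigned bus, unsigned devfn)
{
    return snprintf(buf, len, "/proc/bus/pci/%02x/%02x.%x",
                    bus, (devfn >> 3) & 0x1f, devfn & 0x7);
}

/*
 * Hypothetical helper: map `size` bytes of PCI memory space starting at
 * bus address `bar_addr` through the given device's proc entry.
 * Returns MAP_FAILED on any error.
 */
static void *pci_map_mem(unsigned bus, unsigned devfn, off_t bar_addr, size_t size)
{
    char path[64];
    void *p;
    int fd;

    pci_proc_path(path, sizeof(path), bus, devfn);
    fd = open(path, O_RDWR);
    if (fd < 0)
        return MAP_FAILED;

    /* Select memory space; PCIIOC_MMAP_IS_IO would select IO space instead. */
    if (ioctl(fd, PCIIOC_MMAP_IS_MEM) < 0) {
        close(fd);
        return MAP_FAILED;
    }

    /* The mmap offset is the absolute address in bus space. */
    p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, bar_addr);
    close(fd);   /* the mapping stays valid after the fd is closed */
    return p;
}
```

The limitation described above is that, on PPC at the time, the address passed as the mmap offset had to fall inside one of the opened device's own BARs.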

I could add the host bridge thing fairly easily, but I think it
is not very practical. Well, I probably have to add it anyway in
case any existing stuff uses that, but it's definitely not
practical when you have a bit of userland that knows the PCI ID
(domain/bus/devfn) of the card and wants to access the legacy IO
space of that bridge. The problem is finding which pci_dev is
the host bridge, if any, since host bridges aren't required to
show up at all.

Right now, I have a nice piece of code that goes straight to the
correct entry. Having to "find" the host means iterating over all
devices, asking for their domain number, and finding the one that is
both of class host bridge and on the same domain. That also means
properly fixing up any host bridge that doesn't show itself
as a class host bridge device (you know how things can be, especially
in the embedded world) and leaves a potential problem with host
bridges that just don't show up on the bus (that's currently the
case with the HyperTransport host on the G5, though I could fix
that in the long term, I suspect it may happen again elsewhere
as they aren't required to show up as devices).
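The search described above could look roughly like the following sketch. It assumes the caller has already collected, for each device, its domain number and the first 64 bytes of its config space; struct pci_entry and both function names are hypothetical. The class code value 0x0600 (bridge, host) and its offset 0x0a in the config header come from the PCI spec.

```c
#include <stddef.h>
#include <stdint.h>

#define CFG_CLASS_DEVICE_OFF  0x0a    /* 16-bit class/subclass, little-endian */
#define CLASS_BRIDGE_HOST     0x0600  /* base class 0x06, subclass 0x00 */

/* Hypothetical record, filled in by scanning /proc/bus/pci beforehand. */
struct pci_entry {
    int      domain;    /* domain (controller) number of the device */
    unsigned bus;
    unsigned devfn;
    uint8_t  cfg[64];   /* standard config header bytes */
};

/* Extract the 16-bit class code from raw config space bytes. */
static uint16_t pci_class(const uint8_t *cfg)
{
    return (uint16_t)(cfg[CFG_CLASS_DEVICE_OFF] |
                      (cfg[CFG_CLASS_DEVICE_OFF + 1] << 8));
}

/*
 * Return the first device in `devs` that is of class host bridge and
 * lives in `domain`, or NULL if no such device is visible.
 */
static const struct pci_entry *find_host_bridge(const struct pci_entry *devs,
                                                size_t n, int domain)
{
    for (size_t i = 0; i < n; i++) {
        if (devs[i].domain == domain &&
            pci_class(devs[i].cfg) == CLASS_BRIDGE_HOST)
            return &devs[i];
    }
    return NULL;
}
```

A NULL result is exactly the failure case discussed here: the bridge either does not appear on the bus at all or does not advertise the host bridge class.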

What do you think of dealing with that with a slight addition to
the current ioctl's, basically adding a pair equivalent to
PCIIOC_MMAP_IS_IO and PCIIOC_MMAP_IS_MEM that would be
PCIIOC_MMAP_IS_HOST_IO and PCIIOC_MMAP_IS_HOST_MEM ? That would
be, imho, the simplest solution, as far as userland is concerned,
though it requires updating the implementation of pci_mmap_page_range()
of all archs that implement it.

Another, simpler solution would be to consider that mapping outside
of a device is only ever useful for legacy ISA IO space, and simply
fix the archs to allow an IO mapping of the IO region below 0x400
using any device's pci_dev (provided the region exists on the given
host, of course). Since the value passed by userland is an absolute
BAR address in bus space, that would work.

What do you think ?

Ben.



2004-03-23 02:31:30

by David Miller

Subject: Re: Issues with /proc/bus/pci

On Tue, 23 Mar 2004 13:06:53 +1100
Benjamin Herrenschmidt wrote:

> I could add the host bridge thing fairly easily, but I think it
> is not very practical. Well, I probably have to add it anyway in
> case any existing stuff uses that, but it's definitely not
> practical when you have a bit of userland that knows the PCI ID
> (domain/bus/devfn) of the card and wants to access the legacy IO
> space of that bridge. The problem is finding which pci_dev is
> the host bridge, if any since host bridges aren't required to
> show up at all.

You have a problem. You must implement this 'trick' because things
like xfree86 domain stuff wants it too.

I've been exporting the host PCI bridges to userspace since day one,
and one only needs to walk the devfn/bus numbers properly to find the proper
bridge for a given pci dev, right?

2004-03-23 02:44:21

by Jesse Barnes

Subject: Re: Issues with /proc/bus/pci

On Monday 22 March 2004 6:06 pm, Benjamin Herrenschmidt wrote:
> case with the HyperTransport host on the G5, though I could fix
> that in the long term, I suspect it may happen again elsewhere
> as they aren't required to show up as devices).

The same thing happens on Altix machines, our xio<->pci bridges don't show up.

> What do you think of dealing with that with a slight addition to
> the current ioctl's, basically adding a pair equivalent to
> PCIIOC_MMAP_IS_IO and PCIIOC_MMAP_IS_MEM that would be
> PCIIOC_MMAP_IS_HOST_IO and PCIIOC_MMAP_IS_HOST_MEM ? That would
> be, imho, the simplest solution, as far as userland is concerned,
> though it requires updating the implementation of pci_mmap_page_range()
> of all archs that implement it.

It's been a while since I looked at this API, but this would allow userland to
map legacy I/O or mem space for a given device using the above commands?

> Another simpler solution would be to consider that mapping outside
> of a device is only ever useful for legacy ISA IO space and simply
> fix the archs to allow an IO mapping of the IO region below 0x400
> using any device pci_dev (provided the region exist on a given host
> of course), since the value passed by userland is an absolute BAR
> address in bus space, that would work.

This would also work, but then on some platforms (like ia64) legacy space
requires special treatment since target aborts can cause hard fails, so I'd
prefer the previous approach since it would make setup for dealing with such
platforms a bit more explicit.

Thanks,
Jesse

2004-03-23 02:54:55

by Benjamin Herrenschmidt

Subject: Re: Issues with /proc/bus/pci

On Tue, 2004-03-23 at 13:31, David S. Miller wrote:
> On Tue, 23 Mar 2004 13:06:53 +1100
> Benjamin Herrenschmidt wrote:
>
> > I could add the host bridge thing fairly easily, but I think it
> > is not very practical. Well, I probably have to add it anyway in
> > case any existing stuff uses that, but it's definitely not
> > practical when you have a bit of userland that knows the PCI ID
> > (domain/bus/devfn) of the card and wants to access the legacy IO
> > space of that bridge. The problem is finding which pci_dev is
> > the host bridge, if any since host bridges aren't required to
> > show up at all.
>
> You have a problem. You must implement this 'trick' because things
> like xfree86 domain stuff wants it too.

I know, but xfree is probably the only thing that matters and it
can probably still be changed if we go a different way... My main
issue is that there is no guarantee the host bridge shows up as
a PCI device as far as I know. It's not in the spec, some bridges
are currently hidden on some ppc 4xx embedded platforms (though
in that case, we could probably quirk to just hide the BARs since
those are the problem), but in general, I think it may be a
problem.

> I've been exporting the host PCI bridges to the usespace since day one,
> and one only needs walk the devfn/bus numbers properly to find the proper
> bridge for a given pci dev, right?

Well. On PowerMacs (and apparently on pSeries too), I usually have
the host bridge show up as a device, except for the G5's HyperTransport
(but that's fixable as it does exist, though with a weird config
space access method that I didn't bother to implement yet). But it's
not obvious which device is the host bridge ;) In an ideal world,
we could say that you have to walk all devices in the system, and
find the one of class "host" with the same domain number. But
that assumes:

- That the host bridge does show up as a PCI device, which isn't
required afaik
- That the host bridge does have a PCI class of host bridge,
which may not be true (though that's quirk'able)
- The actual "discovery" of it above from userland isn't that
simple, especially since for compatibility with existing userland,
our /proc/bus/pci/XX numbers don't show the domain number (at
least on ppc32) since we renumber busses to not overlap. I will
change that in the 2.7 timeframe. So I need to actually open all
devices and ask for their domain number with the PCIIOC_CONTROLLER ioctl.

This overall makes the mechanism a bit fragile & non-trivial imho.
It would be nice to have a simple way to go straight to the bus
mappings given the pci_dev (without even having to bother "finding"
the host bridge), either with additional ioctl's or just by allowing
the "low IOs" mapping.
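As an illustration of the per-device domain query mentioned above, here is a minimal sketch. It assumes a Linux system with <linux/pci.h> and assumes the kernel reports the domain number as the return value of the PCIIOC_CONTROLLER ioctl; the function name pci_domain_of() is hypothetical.

```c
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/pci.h>   /* PCIIOC_CONTROLLER */

/*
 * Hypothetical helper: return the PCI domain (controller) number of the
 * device behind a /proc/bus/pci entry, or -1 if the entry cannot be
 * opened or the ioctl fails.
 */
static int pci_domain_of(const char *path)
{
    int fd = open(path, O_RDONLY);
    int domain;

    if (fd < 0)
        return -1;

    /* Assumption: the domain number comes back as the ioctl return value. */
    domain = ioctl(fd, PCIIOC_CONTROLLER);
    close(fd);
    return domain;
}
```

Userland would have to call something like this once per entry under /proc/bus/pci just to learn which devices share a domain with the card, which is the overhead being objected to.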

What do you think ?

Ben.


2004-03-23 03:48:04

by David Miller

Subject: Re: Issues with /proc/bus/pci

On Tue, 23 Mar 2004 13:40:11 +1100
Benjamin Herrenschmidt wrote:

> What do you think ?

Ok, it does sound like we need something else.

Another idea is to always at least provide a "virtual" host
bridge on these weird platforms you mention. You control
the PCI config space etc. operations, so you could handle
the virtual host bridge correctly right?

2004-03-23 04:17:05

by Benjamin Herrenschmidt

Subject: Re: Issues with /proc/bus/pci

On Tue, 2004-03-23 at 14:47, David S. Miller wrote:
> On Tue, 23 Mar 2004 13:40:11 +1100
> Benjamin Herrenschmidt wrote:
>
> > What do you think ?
>
> Ok, it does sound like we need something else.
>
> Another idea is to always at least provide a "virtual" host
> bridge on these weird platforms you mention. You control
> the PCI config space etc. operations, so you could handle
> the virtual host bridge correctly right?

Yes. Though I'm not sure I like the idea.

Note that I don't have a platform affected by this problem at hand
(except the G5, which I can tweak to make the HT host show up).
I have to double check some of the weirdest embedded PPC setups though;
those "hiding" the host bridge should hopefully all have been converted
to just hide its BARs by now. But the theoretical problem persists.

And it's still not convenient for userland things that need to access
one video card, knowing its PCI ID, to have to iterate around to find
the host bridge, but I can live with that ;) Actually, my specific
need for this softboot thing is powermac specific, so I can just
hack the ppc port to always accept a mapping of the low 0x400 of IO
space from any PCI device... (provided those are actually in the host
bridge resources).

I'll see what I can do on our side.

Ben.


2004-03-23 04:27:26

by Benjamin Herrenschmidt

Subject: Re: Issues with /proc/bus/pci

On Tue, 2004-03-23 at 14:47, David S. Miller wrote:
> On Tue, 23 Mar 2004 13:40:11 +1100
> Benjamin Herrenschmidt wrote:
>
> > What do you think ?
>
> Ok, it does sound like we need something else.

Ok, finally, I found out the truth about the ppc "fix" for that:
paulus did indeed fix it ... in 2.4 and not 2.5 :( Paul's fix is
simply to have the arch allow mapping of any address that is within
the host resource ranges, whatever device was passed in (that
is basically the same as considering any device to be the host
bridge).

That should be completely compatible with userland looking for the
host bridge and would let me get the low VGA stuff for my video
softboot hack from the same device fd without having to look for
the host bridge at all.
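The check behind such a fix can be sketched as a simple range test. This is an illustration only, not the actual 2.4 ppc code: struct range, mapping_allowed() and the idea of passing the host bridge's windows for one address space as an array are all assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one host bridge window, bounds inclusive. */
struct range {
    uint64_t start;
    uint64_t end;
};

/*
 * Return true if [addr, addr + len) lies entirely inside one of the
 * host bridge windows for the selected space (IO or memory), no matter
 * which device's fd the request came through.
 */
static bool mapping_allowed(const struct range *host, size_t n,
                            uint64_t addr, uint64_t len)
{
    uint64_t last;

    if (len == 0)
        return false;
    last = addr + len - 1;
    if (last < addr)            /* reject ranges that wrap around */
        return false;

    for (size_t i = 0; i < n; i++) {
        if (addr >= host[i].start && last <= host[i].end)
            return true;
    }
    return false;
}
```

With an IO window that starts at 0, a request for the legacy VGA ports below 0x400 passes this test through any device on that host, which is what the softboot case needs.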

Ben.