2007-12-13 03:00:37

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Possible issue with dangling PCI BARs

While reworking the powerpc PCI resource management, to make it more
x86-like and use more of the generic code in setup-bus.c and
setup-res.c, I noticed something a bit fishy in the way we deal with
resources that we failed to assign.

That's especially problematic for us as we have to deal with all sort of
more or less crappy firmwares, so we need our PCI code to be as robust
as possible vs. whatever might have been there before the kernel kicks
in.

Now, let's say that we have a device for which a BAR contain a value out
of whack or conflicting with something else. We "see" that in
pcibios_allocate_resources() and set res->start to 0 (and offset
res->end accordingly).

At this stage, we thus have a resource that is marked as unassigned.
However, we haven't actually modified the BAR content. So the device is
still potentially decoding the conflicting addresses (which may have
been allocated to some other device).

Now, normally, pci_assign_unassigned_resources() will take care of that
and assign some new address to this device. However, that may fail
(bridge window too small for example). Which mean we leave a device with
a dangling BAR. In fact, a driver may even successfully call
pci_enable_device_bars() on it, provided that the device has several
resources and that the one that is/are requested by the driver (via the
bitmask argument) are not the one(s) that failed to allocate.

So not only we can have a dangling BAR, but nothing prevent us to
actually go turn IO or MEM decoding on in case it wasn't already the
case on that device.

Ben.


2007-12-13 03:05:25

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Re: Possible issue with dangling PCI BARs


On Thu, 2007-12-13 at 14:00 +1100, Benjamin Herrenschmidt wrote:

.../...

(oops, sent too fast)

> So not only we can have a dangling BAR, but nothing prevent us to
> actually go turn IO or MEM decoding on in case it wasn't already the
> case on that device.

And I was about to say before I clicked "send".. can't we do something like
writing all ff's into the BAR at the same time as we clear res->start ? Isn't
that supposed to pretty much disable decoding on that BAR ? Or not... Probably
still better than leaving it to whatever dangling value it had no ?

Also, maybe we should disable IO and MEM decoding by default on devices for
which one resource of that type failed to allocate ?

Ben.

2007-12-13 03:40:26

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Re: Possible issue with dangling PCI BARs


On Thu, 2007-12-13 at 14:05 +1100, Benjamin Herrenschmidt wrote:
> On Thu, 2007-12-13 at 14:00 +1100, Benjamin Herrenschmidt wrote:
>
> .../...
>
> (oops, sent too fast)
>
> > So not only we can have a dangling BAR, but nothing prevent us to
> > actually go turn IO or MEM decoding on in case it wasn't already the
> > case on that device.
>
> And I was about to say before I clicked "send".. can't we do something like
> writing all ff's into the BAR at the same time as we clear res->start ? Isn't
> that supposed to pretty much disable decoding on that BAR ? Or not... Probably
> still better than leaving it to whatever dangling value it had no ?

Ok, reading some other threads, it seems that writing all ff's will not
be a very good alternative on x86 machines where MMCONFIG sits up
there...

I suppose there is nothing totally safe that can be done, thanks to
Intel not thinking about making BARs individually enable/disable'able
(or size-able without interrupting access, among other numerous fuckups
in the PCI spec).

So if a BAR is left dangling, I think we -must- disable MEM and IO
decoding on the whole device. In fact, the whole trick of passing a
bitmask of required BARs to pci_enable_device_bars() in the first place
doesn't fly.

Yuck.

Ben.

2007-12-13 04:23:19

by Robert Hancock

[permalink] [raw]
Subject: Re: Possible issue with dangling PCI BARs

Benjamin Herrenschmidt wrote:
> On Thu, 2007-12-13 at 14:05 +1100, Benjamin Herrenschmidt wrote:
>> On Thu, 2007-12-13 at 14:00 +1100, Benjamin Herrenschmidt wrote:
>>
>> .../...
>>
>> (oops, sent too fast)
>>
>>> So not only we can have a dangling BAR, but nothing prevent us to
>>> actually go turn IO or MEM decoding on in case it wasn't already the
>>> case on that device.
>> And I was about to say before I clicked "send".. can't we do something like
>> writing all ff's into the BAR at the same time as we clear res->start ? Isn't
>> that supposed to pretty much disable decoding on that BAR ? Or not... Probably
>> still better than leaving it to whatever dangling value it had no ?
>
> Ok, reading some other threads, it seems that writing all ff's will not
> be a very good alternative on x86 machines where MMCONFIG sits up
> there...
>
> I suppose there is nothing totally safe that can be done, thanks to
> Intel not thinking about making BARs individually enable/disable'able
> (or size-able without interrupting access, among other numerous fuckups
> in the PCI spec).
>
> So if a BAR is left dangling, I think we -must- disable MEM and IO
> decoding on the whole device. In fact, the whole trick of passing a
> bitmask of required BARs to pci_enable_device_bars() in the first place
> doesn't fly.
>
> Yuck.

We could do a bit better than that - a common use case with
pci_enable_device_bars would be where the device has some IO space that
we don't care about because we only want to use MMIO space. If we only
want to enable MMIO BARs then we don't need to enable IO decoding, and
in that case it doesn't matter if we failed to find space for the IO
space and it overlaps something else.

It looks like we already handle the "not enabling IO decoding" part in
this case, except that it doesn't look like we ever would disable the
decoding if it was already enabled.

For the case where you say "I want to enable decoding for this MMIO BAR,
but not that one", though, I don't see an obvious way to provide that
guarantee with certainty. Normally, one would expect that if a BAR is
mapped safely outside the decode window of a PCI bridge it's behind,
that it won't ever see the requests and can't respond to them. However,
the Intel chipset MMCONFIG overlap fiasco appears to show that this is
not always the case and in some cases the device can see and respond to
requests outside of the bridge's decode window (with higher decode
priority than the MMCONFIG aperture, even)..

--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from [email protected]
Home Page: http://www.roberthancock.com/

2007-12-13 05:27:17

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Re: Possible issue with dangling PCI BARs


> We could do a bit better than that - a common use case with
> pci_enable_device_bars would be where the device has some IO space that
> we don't care about because we only want to use MMIO space. If we only
> want to enable MMIO BARs then we don't need to enable IO decoding, and
> in that case it doesn't matter if we failed to find space for the IO
> space and it overlaps something else.

Yes, we could at least separate memory from IO.

> It looks like we already handle the "not enabling IO decoding" part in
> this case, except that it doesn't look like we ever would disable the
> decoding if it was already enabled.

Yup.

> For the case where you say "I want to enable decoding for this MMIO BAR,
> but not that one", though, I don't see an obvious way to provide that
> guarantee with certainty. Normally, one would expect that if a BAR is
> mapped safely outside the decode window of a PCI bridge it's behind,
> that it won't ever see the requests and can't respond to them. However,
> the Intel chipset MMCONFIG overlap fiasco appears to show that this is
> not always the case and in some cases the device can see and respond to
> requests outside of the bridge's decode window (with higher decode
> priority than the MMCONFIG aperture, even)..

Yup, which is why I believe we would be reasonably safe if we did
something along the lines of: when we fail to assign a resource, we
disable decoding on the device. Either both or only the "side" (IO vs.
MEM) of the resource we failed assigning.

In addition, we modify pcibios_enable_device() to verify that if it's
going to enable MEM or IO, there is no BAR of that type that is left
unassigned, even if those aren't part of the mask.

I can try to whip up some code tomorrow I suppose, though I'm always
afraid some dodgy x86 setup will blow up...

Cheers,
Ben.

2007-12-13 09:04:30

by Ivan Kokshaysky

[permalink] [raw]
Subject: Re: Possible issue with dangling PCI BARs

On Wed, Dec 12, 2007 at 10:22:22PM -0600, Robert Hancock wrote:
> For the case where you say "I want to enable decoding for this MMIO BAR,
> but not that one", though, I don't see an obvious way to provide that
> guarantee with certainty. Normally, one would expect that if a BAR is
> mapped safely outside the decode window of a PCI bridge it's behind,
> that it won't ever see the requests and can't respond to them. However,
> the Intel chipset MMCONFIG overlap fiasco appears to show that this is
> not always the case and in some cases the device can see and respond to
> requests outside of the bridge's decode window (with higher decode
> priority than the MMCONFIG aperture, even)..

>From my reading of Intel specs, these priority decode rules apply only
to legacy VGA aperture (which doesn't have a BAR anyway) - that's why you
cannot have both internal and external VGA working together on these
chipsets. But for any normal BAR classic PCI bridge decode still works
as expected.

However, mapping a BAR outside the decode window of a PCI bridge
can be dangerous for another reason - it could clash with DMA from
a sibling device. Either with DMA to system RAM, if you put that BAR
below the parent bridge window, or with MSI, if you put it above...

So disabling memory or IO decode in a command register seems to be
the only safe option. This depends on architecture, though.

Ivan.

2007-12-13 09:14:50

by Ivan Kokshaysky

[permalink] [raw]
Subject: Re: Possible issue with dangling PCI BARs

On Thu, Dec 13, 2007 at 04:26:42PM +1100, Benjamin Herrenschmidt wrote:
> I can try to whip up some code tomorrow I suppose, though I'm always
> afraid some dodgy x86 setup will blow up...

That scares me too, but something like pci_dangling_bar(dev, idx) with
a default (for now) no-op implementation in asm-generic/pci.h should
be safe...

Ivan.

2007-12-13 09:29:56

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Re: Possible issue with dangling PCI BARs


On Thu, 2007-12-13 at 12:14 +0300, Ivan Kokshaysky wrote:
> On Thu, Dec 13, 2007 at 04:26:42PM +1100, Benjamin Herrenschmidt wrote:
> > I can try to whip up some code tomorrow I suppose, though I'm always
> > afraid some dodgy x86 setup will blow up...
>
> That scares me too, but something like pci_dangling_bar(dev, idx) with
> a default (for now) no-op implementation in asm-generic/pci.h should
> be safe...

Well, the code that detects collisions and "unsets" a resource is arch
in the first place, and so is pcibios_enable_device().

So I'm going to do a powerpc implementation & test it a bit with what I
have around, and then do an x86 since with my latest (pending for .25)
patches. powerpc PCI code is -very- similar to x86.

Cheers,
Ben.

2007-12-13 10:30:54

by Alan

[permalink] [raw]
Subject: Re: Possible issue with dangling PCI BARs

> So disabling memory or IO decode in a command register seems to be
> the only safe option. This depends on architecture, though.

You are assuming a degree of sanity that seems unwise (at least for PC
class hardware).

Whether a given BAR is decoded depends not only on the contents of the
BAR but also the hardware configuration *specific* to the device.

The SIL680 for example has an MMIO BAR at BAR5. Control for that BAR is
via MMIO_EN which is a bit in PCI config register 0x8A.

So if we disable the device because of a dangling BAR the users root file
system goes away. If we leave it as is we have to know the
firmware/hardware came up with that BAR disabled or how to control it at
a per device level.

Supporting pci_enable_device_io / pci_enable_device_mmio / pci_iomap_io /
pci_iomap_mmio seems to cover pretty much all the use cases we have.

The users we have right now that are:

- pata_cs5520 (can be dealt with easily)
- old IDE (with the new resource handling for legacy IDE
can use pci_enable_device_io I think, ditto pci/cs5520)
- scx200_acb (looks like a simple substitution works)
- lpfc pci_enable_device_mmio
- qla2xxx pci_enable_device ? (enables IO and MMIO)

Alan

2007-12-13 11:17:36

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Re: Possible issue with dangling PCI BARs


On Thu, 2007-12-13 at 10:24 +0000, Alan Cox wrote:
>
> The SIL680 for example has an MMIO BAR at BAR5. Control for that BAR is
> via MMIO_EN which is a bit in PCI config register 0x8A.
>
> So if we disable the device because of a dangling BAR the users root file
> system goes away. If we leave it as is we have to know the
> firmware/hardware came up with that BAR disabled or how to control it at
> a per device level.

Except that it's very likely that it will be assigned, whether it's used
or not, either by the BIOS or the kernel during
pci_assign_unassign_resources().

If we really have some devices where we -know- we can hide the thing
totally, we can then do a quirk for these.

We may be taking a small risk here, but what else do you propose ? The
problem of dangling BARs is real... One option is for me to address it
on powerpc and leave x86 alone as it might be more of an issue with
random crazy embedded firmware for us than it is for x86. As I said, the
workaround is completely self contained within arch code for now.

> Supporting pci_enable_device_io / pci_enable_device_mmio / pci_iomap_io /
> pci_iomap_mmio seems to cover pretty much all the use cases we have.
>
> The users we have right now that are:
>
> - pata_cs5520 (can be dealt with easily)
> - old IDE (with the new resource handling for legacy IDE
> can use pci_enable_device_io I think, ditto pci/cs5520)
> - scx200_acb (looks like a simple substitution works)
> - lpfc pci_enable_device_mmio
> - qla2xxx pci_enable_device ? (enables IO and MMIO)

Ok.

Ben.

2007-12-13 11:20:48

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Re: Possible issue with dangling PCI BARs


> > Supporting pci_enable_device_io / pci_enable_device_mmio / pci_iomap_io /
> > pci_iomap_mmio seems to cover pretty much all the use cases we have.
> >
> > The users we have right now that are:
> >
> > - pata_cs5520 (can be dealt with easily)
> > - old IDE (with the new resource handling for legacy IDE
> > can use pci_enable_device_io I think, ditto pci/cs5520)
> > - scx200_acb (looks like a simple substitution works)
> > - lpfc pci_enable_device_mmio
> > - qla2xxx pci_enable_device ? (enables IO and MMIO)

I may have not fully undestood you in my previous reply. You are proposing
replacing pci_enable_device_bars() with a pair of pci_enable_device_io/mem ?

I think that would be a good idea indeed.

Cheers,
Ben.

2007-12-13 13:34:06

by Alan

[permalink] [raw]
Subject: Re: Possible issue with dangling PCI BARs

> We may be taking a small risk here, but what else do you propose ? The
> problem of dangling BARs is real... One option is for me to address it
> on powerpc and leave x86 alone as it might be more of an issue with
> random crazy embedded firmware for us than it is for x86. As I said, the
> workaround is completely self contained within arch code for now.

I think the cases like the SIL680 are still covered by the comment below

> > Supporting pci_enable_device_io / pci_enable_device_mmio / pci_iomap_io /
> > pci_iomap_mmio seems to cover pretty much all the use cases we have.
> >
> > The users we have right now that are:
> >
> > - pata_cs5520 (can be dealt with easily)
> > - old IDE (with the new resource handling for legacy IDE
> > can use pci_enable_device_io I think, ditto pci/cs5520)
> > - scx200_acb (looks like a simple substitution works)
> > - lpfc pci_enable_device_mmio
> > - qla2xxx pci_enable_device ? (enables IO and MMIO)
>
> Ok.

If we get those in I can fix up libata in no time and sort out the CS5520
devices. Rest is easy providing Bartlomiej can sort out old IDE core.

2007-12-13 20:10:47

by Jesse Barnes

[permalink] [raw]
Subject: Re: Possible issue with dangling PCI BARs

On Thursday, December 13, 2007 3:20 Benjamin Herrenschmidt wrote:
> > > Supporting pci_enable_device_io / pci_enable_device_mmio /
> > > pci_iomap_io / pci_iomap_mmio seems to cover pretty much all the
> > > use cases we have.
> > >
> > > The users we have right now that are:
> > >
> > > - pata_cs5520 (can be dealt with easily)
> > > - old IDE (with the new resource handling for
> > > legacy IDE can use pci_enable_device_io I think, ditto
> > > pci/cs5520)
> > > - scx200_acb (looks like a simple substitution works)
> > > - lpfc pci_enable_device_mmio
> > > - qla2xxx pci_enable_device ? (enables IO and MMIO)
>
> I may have not fully undestood you in my previous reply. You are
> proposing replacing pci_enable_device_bars() with a pair of
> pci_enable_device_io/mem ?
>
> I think that would be a good idea indeed.

Yeah, that seems like a reasonable compromise. Though in practice I'd
expect the full disable decode approach to work fairly well too. I
mean, if we really end up failing to allocate space for the device with
the root drive on it, there are probably bigger issues than just
failing to get a few bytes of I/O space for it...

OTOH like Robert said, many devices really only need either MMIO or IO
space enabled, not both, so having separate enable/disable routines for
them makes a lot of sense.

Jesse

2007-12-13 20:52:07

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Re: Possible issue with dangling PCI BARs


On Thu, 2007-12-13 at 12:04 -0800, Jesse Barnes wrote:
>
> Yeah, that seems like a reasonable compromise. Though in practice
> I'd
> expect the full disable decode approach to work fairly well too. I
> mean, if we really end up failing to allocate space for the device
> with
> the root drive on it, there are probably bigger issues than just
> failing to get a few bytes of I/O space for it...
>

The really bad scenario would be something like the Sil680 that Alan
talked about setup by a BIOS that "knows" about the unused BAR when
MMIO_EN is not set.

If the device is behind a P2P bridge and the BIOS has set the windows of
that bridge so tightly that there is no room to allocate the MMIO BAR,
then a full disable/full enable would fail on a device that would
otherwise work using only PIO.

However, I'd be curious to see that happening in practice :-)

But I think it's fair enough to do an IO only / MEM only approach. I've
seen cases where IO is just not useable because of other constraints and
so I expect the MEM-only case to be more common, especially on non-x86.

Ben.

2007-12-13 21:02:27

by Ivan Kokshaysky

[permalink] [raw]
Subject: Re: Possible issue with dangling PCI BARs

On Thu, Dec 13, 2007 at 10:20:20PM +1100, Benjamin Herrenschmidt wrote:
> I may have not fully undestood you in my previous reply. You are proposing
> replacing pci_enable_device_bars() with a pair of pci_enable_device_io/mem ?
>
> I think that would be a good idea indeed.

I'm all for it. Never liked pci_enable_device_bars :-)

Ivan.

2007-12-13 21:12:44

by Ivan Kokshaysky

[permalink] [raw]
Subject: Re: Possible issue with dangling PCI BARs

On Fri, Dec 14, 2007 at 07:51:06AM +1100, Benjamin Herrenschmidt wrote:
> If the device is behind a P2P bridge and the BIOS has set the windows of
> that bridge so tightly that there is no room to allocate the MMIO BAR,
> then a full disable/full enable would fail on a device that would
> otherwise work using only PIO.

It won't be a problem with separate io/mmio enable.

> However, I'd be curious to see that happening in practice :-)
>
> But I think it's fair enough to do an IO only / MEM only approach. I've
> seen cases where IO is just not useable because of other constraints and
> so I expect the MEM-only case to be more common, especially on non-x86.

Everybody wants MEM if it's available - it's just faster :-)
So I guess a common case will be

if (pci_enable_device_mmio(dev)) {
/* failed, fallback to IO */
if (pci_enable_device_io(dev))
return -ENODEV;
...
}

Ivan.

2007-12-13 23:10:20

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Re: Possible issue with dangling PCI BARs


On Fri, 2007-12-14 at 00:12 +0300, Ivan Kokshaysky wrote:
> On Fri, Dec 14, 2007 at 07:51:06AM +1100, Benjamin Herrenschmidt wrote:
> > If the device is behind a P2P bridge and the BIOS has set the windows of
> > that bridge so tightly that there is no room to allocate the MMIO BAR,
> > then a full disable/full enable would fail on a device that would
> > otherwise work using only PIO.
>
> It won't be a problem with separate io/mmio enable.
>
> > However, I'd be curious to see that happening in practice :-)
> >
> > But I think it's fair enough to do an IO only / MEM only approach. I've
> > seen cases where IO is just not useable because of other constraints and
> > so I expect the MEM-only case to be more common, especially on non-x86.
>
> Everybody wants MEM if it's available - it's just faster :-)
> So I guess a common case will be

.../...

Right, I'm going to cook up some patch as time permit, maybe not before
next week tho.

Cheers,
Ben.

2007-12-14 11:53:14

by Jon Masters

[permalink] [raw]
Subject: Re: Possible issue with dangling PCI BARs


On Thu, 2007-12-13 at 14:00 +1100, Benjamin Herrenschmidt wrote:
> While reworking the powerpc PCI resource management, to make it more
> x86-like and use more of the generic code in setup-bus.c and
> setup-res.c, I noticed something a bit fishy in the way we deal with
> resources that we failed to assign.

On a tangent here, is there a way (other than pci=rom) to ensure greater
address space/resource space for PCI resource mappings?

Jon.

2007-12-14 22:13:07

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Re: Possible issue with dangling PCI BARs


On Fri, 2007-12-14 at 06:52 -0500, Jon Masters wrote:
> On Thu, 2007-12-13 at 14:00 +1100, Benjamin Herrenschmidt wrote:
> > While reworking the powerpc PCI resource management, to make it more
> > x86-like and use more of the generic code in setup-bus.c and
> > setup-res.c, I noticed something a bit fishy in the way we deal with
> > resources that we failed to assign.
>
> On a tangent here, is there a way (other than pci=rom) to ensure greater
> address space/resource space for PCI resource mappings?

limit lowmem ? 4g/4g ? 64 bits ? :-)

Ben

2007-12-15 02:19:24

by Jon Masters

[permalink] [raw]
Subject: Re: Possible issue with dangling PCI BARs


On Sat, 2007-12-15 at 09:11 +1100, Benjamin Herrenschmidt wrote:
> On Fri, 2007-12-14 at 06:52 -0500, Jon Masters wrote:
> > On Thu, 2007-12-13 at 14:00 +1100, Benjamin Herrenschmidt wrote:
> > > While reworking the powerpc PCI resource management, to make it more
> > > x86-like and use more of the generic code in setup-bus.c and
> > > setup-res.c, I noticed something a bit fishy in the way we deal with
> > > resources that we failed to assign.
> >
> > On a tangent here, is there a way (other than pci=rom) to ensure greater
> > address space/resource space for PCI resource mappings?
>
> limit lowmem ? 4g/4g ? 64 bits ? :-)

Oh, I've seen resource allocation fail on x86_64, which is what bothers
me, since this really shouldn't be happening.

Jon.