2014-07-03 16:46:02

by Federico Vaga

Subject: PCIe bus enumeration

Hello,

(I don't have a deep knowledge of the PCIe specification, so maybe I'm
just missing something.)

Is there a way to force the PCI subsystem to assign a bus number to
every PCIe bridge, even if there is nothing connected?


My aim is to have a bus enumeration that is constant and independent of
what is plugged into the system, so that I can associate a physical slot
with a Linux device address bb:dd.f. Is it possible?

I can do the mapping with a simple shell script by discovering the
"new" bus number, but I'm wondering if there is a way to have a
constant bus enumeration.



My Humble Observation
---------------------
It seems (to me) that for PCI the kernel assigns a bus number to every
PCI bridge and sub-bridge even if there is nothing connected:


e.g. from lspci -t

[...]
+-1e.0-[04-05]----0c.0-[05]--

00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 92)
04:0c.0 PCI bridge: Texas Instruments PCI2050 PCI-to-PCI Bridge (rev
02)


The behavior on PCIe seems different. When there is nothing plugged into
a bus, the kernel doesn't assign any bus number and it doesn't detect
any PCI bridge at all. So, when I reboot the system with a new PCIe
card, the bus enumeration may change.


I tried to use the following pci kernel parameters:

assign-busses : because I want to force the kernel to re-enumerate the
buses, hopefully _all_ buses even if they are empty.

pcie_scan_all : the explanation is not clear, but it sounds like it
tells the kernel to inspect everything.

bfsort : because, maybe, for a breadth-first sort it must assign a
number to each bridge at the same level before inspecting the next one.

noacpi : in order to scan independently of BIOS information.


The result is always the same (empty buses are not enumerated).



Thank you :)

--
Federico Vaga


2014-07-03 19:43:37

by Bjorn Helgaas

Subject: Re: PCIe bus enumeration

On Thu, Jul 3, 2014 at 10:45 AM, Federico Vaga <[email protected]> wrote:
> Hello,
>
> (I haven't a deep knowledge of the PCIe specification, maybe I'm just
> missing something)
>
> is there a way to force the PCI subsystem to assign a bus-number to
> every PCIe bridge, even if there is nothing connected?
>
>
> My aim is to have a bus enumeration constant and independent from what
> I plugged on the system. So, I can associate a physical slot to linux
> device address bb:dd.f. Is it possible?

The /sys/bus/pci/slots/*/address files might help. On my system, I have:

$ grep . /sys/bus/pci/slots/*/address /dev/null
/sys/bus/pci/slots/5/address:0000:03:00

"lspci -v" also shows:

03:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd.
Device 5227 (rev 01)
Physical Slot: 5

If you want to start with a physical slot number and figure out the
bb:dd associated with it, the /sys/bus/pci/slots files are probably
the most straightforward way.
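
A minimal sketch of that lookup in Python, assuming the kernel has
populated /sys/bus/pci/slots (it may be empty, as discussed later in
the thread). The function name and the slots_dir parameter are
illustrative, not part of any existing tool:

```python
from pathlib import Path


def read_slot_addresses(slots_dir="/sys/bus/pci/slots"):
    """Return {slot_name: address} for every slot directory found.

    Each slot directory is expected to hold an "address" file whose
    content looks like "0000:03:00" (domain:bus:device).
    """
    result = {}
    root = Path(slots_dir)
    if not root.is_dir():
        # No slots directory at all: report an empty mapping.
        return result
    for entry in sorted(root.iterdir()):
        address_file = entry / "address"
        if address_file.is_file():
            result[entry.name] = address_file.read_text().strip()
    return result
```

Calling read_slot_addresses() with no argument reads the live sysfs
tree; on the system shown above it would map slot "5" to "0000:03:00".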

> I can do the mapping with a simple shell script by discovering the
> "new" bus number, but I'm wondering if there is a way to have a
> constant bus enumeration.
>
>
>
> My Humble Observation
> ---------------------
> It seems (to me) that for PCI the kernel assigns a bus-number to every
> PCI bridges and sub-bridges even if there is nothing connected:
>
>
> e.g. from lspci -t
>
> [...]
> +-1e.0-[04-05]----0c.0-[05]--
>
> 00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 92)
> 04:0c.0 PCI bridge: Texas Instruments PCI2050 PCI-to-PCI Bridge (rev
> 02)

Yes. I think you're talking about the bridge "secondary bus number".
In this case the 04:0c.0 bridge has secondary bus 05, and there are no
devices on bus 05.

> The behavior on PCIe seems different. When there is nothing plugged on
> a bus, then the kernel doesn't assign any bus-number and it doesn't
> detect any PCI-Bridge at all. So, when I reboot the system with a new
> PCIe card the bus enumeration may change.

I don't think the behavior should be different on PCIe, but maybe if
you have an example, it will help me figure out why it is different.
My current machine has three Root Ports (which are treated as
PCI-to-PCI bridges), and they all have secondary bus numbers assigned,
even though only two have devices below them:

+-1c.0-[01]--
+-1c.3-[02]----00.0
+-1c.5-[03]----00.0

We have to assign a secondary bus number in order to enumerate below
the bridge. We can't even tell whether the bus is empty until we
enumerate it. We should assign a secondary bus number, then enumerate
the secondary bus (possibly finding nothing). If we don't find
anything, I think we currently leave the secondary bus number assigned
even though the bus is empty.
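
To check which bus numbers a bridge currently holds, one can read its
configuration space. The sketch below is only an illustration: it
assumes the standard PCI-to-PCI bridge (type 1) header layout, where
the primary, secondary, and subordinate bus numbers sit at offsets
0x18, 0x19, and 0x1A, and it reads the sysfs "config" file of a device.
The function names are made up for this example:

```python
from pathlib import Path

HEADER_TYPE_OFFSET = 0x0E   # bit 7 = multi-function, bits 6:0 = layout
BUS_NUMBERS_OFFSET = 0x18   # primary, secondary, subordinate (1 byte each)
HEADER_TYPE_BRIDGE = 0x01   # PCI-to-PCI bridge layout


def read_config(address, devices_dir="/sys/bus/pci/devices"):
    """Read the raw config space of a device such as "0000:04:0c.0"."""
    return (Path(devices_dir) / address / "config").read_bytes()


def bridge_bus_numbers(config):
    """Return (primary, secondary, subordinate) or None if not a bridge."""
    if len(config) < BUS_NUMBERS_OFFSET + 3:
        return None
    if config[HEADER_TYPE_OFFSET] & 0x7F != HEADER_TYPE_BRIDGE:
        return None
    primary = config[BUS_NUMBERS_OFFSET]
    secondary = config[BUS_NUMBERS_OFFSET + 1]
    subordinate = config[BUS_NUMBERS_OFFSET + 2]
    return primary, secondary, subordinate
```

For the 04:0c.0 bridge quoted above, shown by lspci -t as [05], one
would expect bridge_bus_numbers(read_config("0000:04:0c.0")) to report
secondary bus 5 even though nothing sits on that bus.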

Bjorn

2014-07-03 20:36:26

by Federico Vaga

Subject: Re: PCIe bus enumeration

(Sorry for the double email: a software update changed my configuration
to HTML email by default, so the Linux kernel mailing list complains
that I'm probably spamming.)

On Thursday 03 July 2014 13:43:14 Bjorn Helgaas wrote:
> On Thu, Jul 3, 2014 at 10:45 AM, Federico Vaga <[email protected]> wrote:
> > Hello,
> >
> > (I haven't a deep knowledge of the PCIe specification, maybe I'm
> > just missing something)
> >
> > is there a way to force the PCI subsystem to assign a bus-number
> > to
> > every PCIe bridge, even if there is nothing connected?
> >
> >
> > My aim is to have a bus enumeration constant and independent from
> > what I plugged on the system. So, I can associate a physical slot
> > to linux device address bb:dd.f. Is it possible?

Some information that I forgot to add: I'm working on kernels 3.2 and
3.6.

> The /sys/bus/pci/slots/*/address files might help. On my system, I
> have:
>
> $ grep . /sys/bus/pci/slots/*/address /dev/null
> /sys/bus/pci/slots/5/address:0000:03:00

My slots directory is empty on 3.2, 3.6, and 3.14. Do I have to compile
the kernel with a particular configuration, or use a kernel parameter?

> "lspci -v" also shows:
>
> 03:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd.
> Device 5227 (rev 01)
> Physical Slot: 5

My lspci doesn't show the "Physical Slot" field. Where does it take
this information from? From the BIOS I suppose, a recent BIOS.

So if you look at your motherboard, you can identify which one is
slot 5.

> If you want to start with a physical slot number and figure out the
> bb.dd associated with it, the /sys/bus/pci/slots files are probably
> the most straightforward way.
>
> > I can do the mapping with a simple shell script by discovering the
> > "new" bus number, but I'm wondering if there is a way to have a
> > constant bus enumeration.
> >
> >
> >
> > My Humble Observation
> > ---------------------
> > It seems (to me) that for PCI the kernel assigns a bus-number to
> > every PCI bridges and sub-bridges even if there is nothing
> > connected:
> >
> >
> > e.g. from lspci -t
> >
> > [...]
> > +-1e.0-[04-05]----0c.0-[05]--
> >
> > 00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 92)
> > 04:0c.0 PCI bridge: Texas Instruments PCI2050 PCI-to-PCI Bridge
> > (rev 02)
>
> Yes. I think you're talking about the bridge "secondary bus
> number". In this case the 04:0c.0 bridge has secondary bus 05, and
> there are no devices on bus 05.

yep

> > The behavior on PCIe seems different. When there is nothing
> > plugged on a bus, then the kernel doesn't assign any bus-number
> > and it doesn't detect any PCI-Bridge at all. So, when I reboot
> > the system with a new PCIe card the bus enumeration may change.
>
> I don't think the behavior should be different on PCIe, but maybe if
> you have an example, it will help me figure out why it is
> different. My current machine has three Root Ports (which are
> treated as PCI-to-PCI bridges), and they all have secondary bus
> numbers assigned, even though only two have devices below them:
>
> +-1c.0-[01]--
> +-1c.3-[02]----00.0
> +-1c.5-[03]----00.0

What I observed is that when several PCIe slots belong to a single PCI
bridge and you plug a board into one of them, it enumerates all the
secondary buses, including the empty ones (as in your case, where all
your slots belong to device 1c).

But if you unplug the devices on secondary buses 02 and 03, you should
not see device 1c anymore. This is what is happening with my machine
(an industrial backplane with several PCI(e) slots and the motherboard
plugged into a special slot).

Even on sysfs the device doesn't appear.

> We have to assign a secondary bus number in order to enumerate below
> the bridge. We can't even tell whether the bus is empty until we
> enumerate it.

Yep, I read the code and that's what I understood.

> We should assign a secondary bus number, then
> enumerate the secondary bus (possibly finding nothing). If we
> don't find anything, I think we currently leave the secondary bus
> number assigned even though the bus is empty.

I'll try to check :)


Thank you Bjorn

--
Federico Vaga

2014-07-03 22:04:48

by Bjorn Helgaas

Subject: Re: PCIe bus enumeration

On Thu, Jul 3, 2014 at 2:40 PM, Federico Vaga <[email protected]> wrote:
> On Thursday 03 July 2014 13:43:14 Bjorn Helgaas wrote:

>> The /sys/bus/pci/slots/*/address files might help. On my system, I
>> have:
>>
>> $ grep . /sys/bus/pci/slots/*/address /dev/null
>> /sys/bus/pci/slots/5/address:0000:03:00
>
> My slots directory is empty on 3.2, 3.6, 3.14. I have to compile the
> kernel with a
> particular configuration? Use a kernel parameter?

Should be built-in, no parameter needed. I think this is from
pci_create_slot() in drivers/pci/slot.c. That's called from
register_slot() (drivers/acpi/pci_slot.c, which obviously depends on
the BIOS) and indirectly from pciehp (which doesn't depend on the BIOS
and reads a slot number from the PCIe capability). "lspci -vv" will
show you this slot number in the SltCap (if the port supports a slot),
e.g.,

00:1c.3 PCI bridge: Intel Corporation Lynx Point-LP PCI Express Root Port 4
Capabilities: [40] Express (v2) Root Port (Slot+), MSI 00
SltCap: AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd-
HotPlug- Surprise-
Slot #3, PowerLimit 10.000W; Interlock- NoCompl+

Since you don't see these, my guess is that your ports don't indicate
that they support a slot, e.g., they might look like this:

00:1c.3 PCI bridge: Intel Corporation Lynx Point-LP PCI Express Root Port 4
Capabilities: [40] Express (v2) Root Port (Slot-), MSI 00

The "Slot-" means the port doesn't have a slot, and lspci won't show
you the SltCap register, and I think the kernel won't put anything in
/sys/bus/pci/slots.
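
The same check can be sketched directly against raw config space. This
is an illustration under stated assumptions, not what lspci or the
kernel literally does: it assumes the usual capability-list layout
(list pointer at 0x34, PCI Express capability ID 0x10), the "Slot
Implemented" bit (bit 8) in the PCI Express Capabilities register at
capability offset 0x02, and the Physical Slot Number in bits 31:19 of
Slot Capabilities at capability offset 0x14. Reading past the first 64
bytes of a sysfs "config" file normally needs root. The function names
are hypothetical:

```python
STATUS_OFFSET = 0x06
STATUS_CAP_LIST = 0x0010      # Status bit 4: capability list present
CAP_POINTER_OFFSET = 0x34
CAP_ID_PCIE = 0x10
PCIE_FLAGS_SLOT = 0x0100      # "Slot Implemented" (Slot+ / Slot- in lspci)
SLTCAP_OFFSET = 0x14
SLTCAP_PSN_SHIFT = 19         # Physical Slot Number lives in bits 31:19


def find_capability(config, cap_id):
    """Walk the capability list and return the offset of cap_id, or None."""
    if len(config) < 0x40:
        return None
    status = int.from_bytes(config[STATUS_OFFSET:STATUS_OFFSET + 2], "little")
    if not status & STATUS_CAP_LIST:
        return None
    pointer = config[CAP_POINTER_OFFSET] & 0xFC
    seen = set()
    while pointer >= 0x40 and pointer + 2 <= len(config) and pointer not in seen:
        seen.add(pointer)                     # guard against looping lists
        if config[pointer] == cap_id:
            return pointer
        pointer = config[pointer + 1] & 0xFC  # next-capability pointer
    return None


def physical_slot_number(config):
    """Return the slot number of a port, or None if it reports no slot."""
    cap = find_capability(config, CAP_ID_PCIE)
    if cap is None or cap + SLTCAP_OFFSET + 4 > len(config):
        return None
    flags = int.from_bytes(config[cap + 2:cap + 4], "little")
    if not flags & PCIE_FLAGS_SLOT:
        return None                           # the "Slot-" case
    sltcap = int.from_bytes(
        config[cap + SLTCAP_OFFSET:cap + SLTCAP_OFFSET + 4], "little")
    return sltcap >> SLTCAP_PSN_SHIFT
```

For a port that lspci shows as "Slot+ ... Slot #3" this would return 3;
for a "Slot-" port it returns None.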

>> "lspci -v" also shows:
>>
>> 03:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd.
>> Device 5227 (rev 01)
>> Physical Slot: 5
>
> My lspci hasn't the "Physical Slot" field. However, where does it take
> this information?
> From the BIOS I suppose, a recent BIOS.

From looking at the lspci source
(git://git.kernel.org/pub/scm/utils/pciutils/pciutils.git), it looks
like that "Physical Slot" comes from /sys/bus/pci/slots/..., so if you
don't have anything there, you won't see "Physical Slot".

>> I don't think the behavior should be different on PCIe, but maybe if
>> you have an example, it will help me figure out why it is
>> different. My current machine has three Root Ports (which are
>> treated as PCI-to-PCI bridges), and they all have secondary bus
>> numbers assigned, even though only two have devices below them:
>>
>> +-1c.0-[01]--
>> +-1c.3-[02]----00.0
>> +-1c.5-[03]----00.0
>
> What I observed is that when several PCIe slot belong to a single PCI
> Bridge, and you
> plug a board in one on these, then it enumerates all secondary buses,
> also the
> empty ones (like your case, all your slot belong to device 1c).
>
> But, if you un-plug the devices on secondary bus 02 and 03, you should
> not see the
> device 1c anymore. This is what is happening with my machine
> [industrial backplane
> with several PCI(e) slots and the motherboard plugged in a special
> slot.].

I think there's something unusual going on with your machine. I can't
remove the devices on my machine (a laptop), but normally the Root
Ports or Downstream Ports leading to the slots continue to exist even
if the slots are empty. In your case, it sounds like there's some
hardware that is turning off power to those ports when the slots are
all empty.

I assume these ports don't support hotplug. If they *did* support
hotplug, those ports would have to exist because they handle the
hotplug events (presence detect, etc.)

If you can collect the complete "lspci -vv" output from your machine
(with a device plugged in, so we can see the port leading to it), that
will help make this more concrete. And maybe one with no devices
plugged in, so we can see exactly what changes.

Bjorn

2014-07-04 07:55:33

by Federico Vaga

Subject: Re: PCIe bus enumeration

> I assume these ports don't support hotplug. If they *did* support
> hotplug, those ports would have to exist because they handle the
> hotplug events (presence detect, etc.)

I asked: yes, they do not support hotplug

> If you can collect the complete "lspci -vv" output from your machine
> (with a device plugged in, so we can see the port leading to it),
> that will help make this more concrete. And maybe one with no
> devices plugged in, so we can see exactly what changes.

I attached two files with the output. I put a card in slot 10 and took
the output, then moved the card to slot 11 and took the output again.

As you can see with diff, the bridge behind the slot disappears when
the slot is empty.
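
A rough way to make that comparison explicit, assuming only that each
device entry in lspci output starts at column 0 with its address
(bb:dd.f, optionally prefixed by a domain) and that detail lines are
indented. The function names are invented for this sketch:

```python
import re

# A device line starts with an address such as "00:1c.3" or "0000:00:1c.3".
DEVICE_LINE = re.compile(
    r"^((?:[0-9a-f]{4}:)?[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s")


def device_addresses(lspci_text):
    """Collect the addresses of all devices listed in lspci output."""
    addresses = set()
    for line in lspci_text.splitlines():
        match = DEVICE_LINE.match(line)
        if match:
            addresses.add(match.group(1))
    return addresses


def compare_listings(first, second):
    """Return (only_in_first, only_in_second) as sorted address lists."""
    a = device_addresses(first)
    b = device_addresses(second)
    return sorted(a - b), sorted(b - a)
```

Feeding it the text of the two attached files would list the addresses
that exist in only one of the two configurations.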

--
Federico Vaga


Attachments:
lspci-slot11-emtpy-slot10-busy (31.42 kB)
lspci-slot11-busy-slot10-empty (31.55 kB)

2014-07-04 21:26:17

by Bjorn Helgaas

Subject: Re: PCIe bus enumeration

On Fri, Jul 04, 2014 at 09:55:20AM +0200, Federico Vaga wrote:
> > I assume these ports don't support hotplug. If they *did* support
> > hotplug, those ports would have to exist because they handle the
> > hotplug events (presence detect, etc.)
>
> I asked: yes, they do not support hotplug
>
> > If you can collect the complete "lspci -vv" output from your machine
> > (with a device plugged in, so we can see the port leading to it),
> > that will help make this more concrete. And maybe one with no
> > devices plugged in, so we can see exactly what changes.
>
> I attached two files with the output. I putted a card in slot 10 and
> took the output, then moved the card on slot 11 and took the output.
>
> As you can see with diff the bridge behind the slot disappear when it
> is empty.

Perfect, thanks! For some reason, it really helps me to be able to stare
at the actual data. Here's the situation with slot 10 occupied:

00:01.0 82Q35 Root Port to [bus 05] PCIe SltCap slot #21
05:00.0 CERN/ECP/EDU Device slot 10
00:1c.0 82801I Express Port 1 to [bus 04] PCIe SltCap slot #22
00:1c.3 (not present at all)
00:1c.4 82801I Express Port 5 to [bus 03] PCIe SltCap slot #0
03:00.0 Realtek NIC

and here it is with slot 11 occupied:

00:01.0 (not present at all)
00:1c.0 82801I Express Port 1 to [bus 05] PCIe SltCap slot #22
00:1c.3 82801I Express Port 4 to [bus 04] PCIe SltCap slot #25
04:00.0 CERN/ECP/EDU Device slot 11
00:1c.4 82801I Express Port 5 to [bus 03] PCIe SltCap slot #0
03:00.0 Realtek NIC

I'm pretty sure this is a function of your BIOS. There are often
device-specific ways to enable or disable individual devices (like the root
ports here), and the BIOS is likely disabling these ports when there is
nothing below them. I don't know why it would turn off 00:1c.3 when its
slot is empty, but it doesn't turn off 00:1c.0, which also leads to an
empty slot. But I don't think Linux is involved in this, and if the BIOS
disables devices, there really isn't anything Linux can do about it.

If you can get to an EFI shell on this box, you might be able to confirm
this with the "pci" command. Booting Linux with "pci=earlydump" is similar
in that it dumps PCI config space before we change anything.

To solve this problem, I think you need slot information even when there's
no hotplug. This has been raised before [1, 2], and I think it's a good
idea, but nobody has implemented it yet.

Another curious thing is that you refer to "slot 10", but there's no
obvious connection between that and the "slot 21" in the PCIe capability of
the Root Port leading to that slot. But I guess you said the slots are in
a backplane (they're not an integral part of the motherboard). In that
case, there's no way for the motherboard to know what the labels on the
backplane are.

Bjorn

[1] http://lkml.kernel.org/r/CAErSpo45sDNPt6=Yw-qgqdojYL8+_JNOVNEnVxRLatga+bY+2A@mail.gmail.com
[2] https://bugzilla.kernel.org/show_bug.cgi?id=72681

2014-07-07 07:29:11

by Federico Vaga

Subject: Re: PCIe bus enumeration

On Friday 04 July 2014 15:26:12 Bjorn Helgaas wrote:
> On Fri, Jul 04, 2014 at 09:55:20AM +0200, Federico Vaga wrote:
> > > I assume these ports don't support hotplug. If they *did*
> > > support
> > > hotplug, those ports would have to exist because they handle the
> > > hotplug events (presence detect, etc.)
> >
> > I asked: yes, they do not support hotplug
> >
> > > If you can collect the complete "lspci -vv" output from your
> > > machine (with a device plugged in, so we can see the port
> > > leading to it), that will help make this more concrete. And
> > > maybe one with no devices plugged in, so we can see exactly
> > > what changes.
> >
> > I attached two files with the output. I putted a card in slot 10
> > and took the output, then moved the card on slot 11 and took the
> > output.
> >
> > As you can see with diff the bridge behind the slot disappear when
> > it is empty.
>
> Perfect, thanks! For some reason, it really helps me to be able to
> stare at the actual data. Here's the situation with slot 10
> occupied:
>
> 00:01.0 82Q35 Root Port to [bus 05] PCIe SltCap slot #21
> 05:00.0 CERN/ECP/EDU Device slot 10
> 00:1c.0 82801I Express Port 1 to [bus 04] PCIe SltCap slot #22
> 00:1c.3 (not present at all)
> 00:1c.4 82801I Express Port 5 to [bus 03] PCIe SltCap slot #0
> 03:00.0 Realtek NIC
>
> and here it is with slot 11 occupied:
>
> 00:01.0 (not present at all)
> 00:1c.0 82801I Express Port 1 to [bus 05] PCIe SltCap slot #22
> 00:1c.3 82801I Express Port 4 to [bus 04] PCIe SltCap slot #25
> 04:00.0 CERN/ECP/EDU Device slot 11
> 00:1c.4 82801I Express Port 5 to [bus 03] PCIe SltCap slot #0
> 03:00.0 Realtek NIC
>
> I'm pretty sure this is a function of your BIOS. There are often
> device-specific ways to enable or disable individual devices (like
> the root ports here), and the BIOS is likely disabling these ports
> when there is nothing below them. I don't know why it would turn
> off 00:1c.3 when its slot is empty, but it doesn't turn off
> 00:1c.0, which also leads to an empty slot. But I don't think
> Linux is involved in this, and if the BIOS disables devices, there
> really isn't anything Linux can do about it.

It also seems to happen on some "classic" PCs. I haven't experienced it
myself, but some friends reported this behavior to me in the recent past.

So, it looks like some BIOSes disable the bridge when there is nothing
behind it. Why? Power saving? :/

> If you can get to an EFI shell on this box, you might be able to
> confirm this with the "pci" command. Booting Linux with
> "pci=earlydump" is similar in that it dumps PCI config space before
> we change anything.

Yes, I confirm: the bridges are not there if I don't plug in the card.

> To solve this problem, I think you need slot information even when
> there's no hotplug. This has been raised before [1, 2], and I
> think it's a good idea, but nobody has implemented it yet.

Yes, but if the BIOS disables the bridge there is nothing we can do.

> Another curious thing is that you refer to "slot 10", but there's no
> obvious connection between that and the "slot 21" in the PCIe
> capability of the Root Port leading to that slot. But I guess you
> said the slots are in a backplane (they're not an integral part of
> the motherboard). In that case, there's no way for the motherboard
> to know what the labels on the backplane are.

It is written on the backplane. I said slot 10 because I'm counting
the available slots, but on the backplane they are labeled 22, 25, and
other non-consecutive numbers.

If I use `biosdecode` I can get that information, but only for the
"first level" of bridges. On some backplanes I have PCI bridges behind
bridges, and in this case biosdecode doesn't help: it just tells me
about the bridge on the motherboard.

At the moment, I'm using the PCI bridge address to make the
association with a specific slot. When the bridges are on, they always
have the same address. A colleague made a map between physical slots and
PCI bridge addresses; from this we can extract the bus number and
identify the cards. But I was looking for a better solution :)
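
The bridge-address approach just described could look roughly like the
sketch below. The SLOT_TO_BRIDGE table is hypothetical: it reuses the
slot 10 and slot 11 bridge addresses from the lspci summary earlier in
this thread and assumes PCI domain 0000; a real table has to be written
per backplane. The sketch relies on sysfs listing the devices behind a
bridge as subdirectories of the bridge's own directory:

```python
import re
from pathlib import Path

# Full address: domain:bus:device.function, e.g. "0000:04:00.0".
BDF = re.compile(r"^[0-9a-f]{4}:([0-9a-f]{2}):[0-9a-f]{2}\.[0-7]$")

# Hypothetical backplane map: slot label -> bridge leading to that slot.
SLOT_TO_BRIDGE = {
    "10": "0000:00:01.0",
    "11": "0000:00:1c.3",
}


def devices_below(bridge, devices_dir="/sys/bus/pci/devices"):
    """List the devices directly behind a bridge (empty if it is absent)."""
    bridge_dir = Path(devices_dir) / bridge
    if not bridge_dir.is_dir():
        return []  # e.g. the BIOS disabled the port because the slot is empty
    return sorted(entry.name for entry in bridge_dir.iterdir()
                  if entry.is_dir() and BDF.match(entry.name))


def bus_of(address):
    """Extract the bus number from a full address, or None if malformed."""
    match = BDF.match(address)
    return int(match.group(1), 16) if match else None
```

With a card in slot 11, devices_below(SLOT_TO_BRIDGE["11"]) would be
expected to return ["0000:04:00.0"], and bus_of() of that entry gives
the bus number currently assigned behind the bridge.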

--
Federico Vaga

2014-07-07 17:34:55

by Bjorn Helgaas

Subject: Re: PCIe bus enumeration

On Mon, Jul 7, 2014 at 1:29 AM, Federico Vaga <[email protected]> wrote:
> On Friday 04 July 2014 15:26:12 Bjorn Helgaas wrote:
>> On Fri, Jul 04, 2014 at 09:55:20AM +0200, Federico Vaga wrote:
>> > > I assume these ports don't support hotplug. If they *did*
>> > > support
>> > > hotplug, those ports would have to exist because they handle the
>> > > hotplug events (presence detect, etc.)
>> >
>> > I asked: yes, they do not support hotplug
>> >
>> > > If you can collect the complete "lspci -vv" output from your
>> > > machine (with a device plugged in, so we can see the port
>> > > leading to it), that will help make this more concrete. And
>> > > maybe one with no devices plugged in, so we can see exactly
>> > > what changes.
>> >
>> > I attached two files with the output. I putted a card in slot 10
>> > and took the output, then moved the card on slot 11 and took the
>> > output.
>> >
>> > As you can see with diff the bridge behind the slot disappear when
>> > it is empty.
>>
>> Perfect, thanks! For some reason, it really helps me to be able to
>> stare at the actual data. Here's the situation with slot 10
>> occupied:
>>
>> 00:01.0 82Q35 Root Port to [bus 05] PCIe SltCap slot #21
>> 05:00.0 CERN/ECP/EDU Device slot 10
>> 00:1c.0 82801I Express Port 1 to [bus 04] PCIe SltCap slot #22
>> 00:1c.3 (not present at all)
>> 00:1c.4 82801I Express Port 5 to [bus 03] PCIe SltCap slot #0
>> 03:00.0 Realtek NIC
>>
>> and here it is with slot 11 occupied:
>>
>> 00:01.0 (not present at all)
>> 00:1c.0 82801I Express Port 1 to [bus 05] PCIe SltCap slot #22
>> 00:1c.3 82801I Express Port 4 to [bus 04] PCIe SltCap slot #25
>> 04:00.0 CERN/ECP/EDU Device slot 11
>> 00:1c.4 82801I Express Port 5 to [bus 03] PCIe SltCap slot #0
>> 03:00.0 Realtek NIC
>>
>> I'm pretty sure this is a function of your BIOS. There are often
>> device-specific ways to enable or disable individual devices (like
>> the root ports here), and the BIOS is likely disabling these ports
>> when there is nothing below them. I don't know why it would turn
>> off 00:1c.3 when its slot is empty, but it doesn't turn off
>> 00:1c.0, which also leads to an empty slot. But I don't think
>> Linux is involved in this, and if the BIOS disables devices, there
>> really isn't anything Linux can do about it.
>
> It seems to happen also on some "classic" PC. I didn't experiment it
> by myself, some friends reported me this behavior in the recent past.
>
> So, It looks like that some BIOS disable the bridge when there is
> nothing behind it. Why? Power save? :/

Could be power savings, or possibly to conserve bus numbers, which are
a limited resource.

>> If you can get to an EFI shell on this box, you might be able to
>> confirm this with the "pci" command. Booting Linux with
>> "pci=earlydump" is similar in that it dumps PCI config space before
>> we change anything.
>
> yes I confirm, the bridge are not there if I don't plug the card.
>
>> To solve this problem, I think you need slot information even when
>> there's no hotplug. This has been raised before [1, 2], and I
>> think it's a good idea, but nobody has implemented it yet.
>
> Yes, but if the BIOS disable the bridge there is nothing we can do.

Well, it's true that it's hard to get constant *bus numbers*, but it's
never really been a good idea to rely on those, because they're
assigned at the discretion of the OS, and there are reasons why the OS
might want to reallocate them, e.g., to accommodate a deep hot-plugged
hierarchy. If you shift focus to *slot numbers*, then I think there's
a lot more we can do.

>> Another curious thing is that you refer to "slot 10", but there's no
>> obvious connection between that and the "slot 21" in the PCIe
>> capability of the Root Port leading to that slot. But I guess you
>> said the slots are in a backplane (they're not an integral part of
>> the motherboard). In that case, there's no way for the motherboard
>> to know what the labels on the backplane are.
>
> It is written on the backplane. I said slot 10 because I'm counting
> the available slot, but on the backplane they are 22, 25, and other
> no-consecutive numbers.

The 22, 25, etc., are in the same range as the slot numbers in the
PCIe Slot Capabilities registers, so maybe the backplane is
constructed to make this possible. The external PCIe chassis I'm
familiar with have one fast link on a cable leading to the box, with a
PCIe switch inside the box. The upstream port is connected to the
incoming link, and there's a downstream port connected to each slot.
In this case, the slot numbers in the downstream ports' Slot
Capabilities registers can be made to match the silkscreen labels on
the board since everything is fixed by the hardware.

Your backplane sounds a little different (you have Ports on the root
bus leading directly to slots in the backplane, so I assume those
Ports are on the motherboard, not the backplane), but maybe the
motherboard & backplane are designed as a unit so the Port slot
numbers could match the backplane.

> If I use `biosdecode` I can get that information, but only for the
> "first level" of bridges. On some backplane I have PCI bridges behind
> bridges, and in this case biosdecode doesn't help: it just tell me
> about the bridge on the motherboard.

What specific biosdecode information are you using? There's a fair
amount of stuff in the PCI-to-PCI bridge spec about slot and chassis
numbering, including some about expansion chassis. I doubt that Linux
implements all that, so there's probably room for a lot of
improvement. I attached your lspci output to the bugzilla
(https://bugzilla.kernel.org/show_bug.cgi?id=72681). Maybe you could
attach the biosdecode info there, too, and we could see if there's a
way we can make this easier.

Bjorn

2014-07-08 07:15:42

by Federico Vaga

Subject: Re: PCIe bus enumeration

(I'm changing my email address to the work one. Initially it was just
my personal curiosity, but now you are helping me with my work, so I
think it is more correct this way.)

> > So, It looks like that some BIOS disable the bridge when there is
> > nothing behind it. Why? Power save? :/
>
> Could be power savings, or possibly to conserve bus numbers, which
> are a limited resource.

What is the maximum number of buses?

> >> If you can get to an EFI shell on this box, you might be able to
> >> confirm this with the "pci" command. Booting Linux with
> >> "pci=earlydump" is similar in that it dumps PCI config space
> >> before
> >> we change anything.
> >
> > yes I confirm, the bridge are not there if I don't plug the card.
> >
> >> To solve this problem, I think you need slot information even
> >> when
> >> there's no hotplug. This has been raised before [1, 2], and I
> >> think it's a good idea, but nobody has implemented it yet.
> >
> > Yes, but if the BIOS disable the bridge there is nothing we can
> > do.
>
> Well, it's true that it's hard to get constant *bus numbers*, but
> it's never really been a good idea to rely on those, because
> they're assigned at the discretion of the OS, and there are reasons
> why the OS might want to reallocate them, e.g., to accommodate a
> deep hot-plugged hierarchy. If you shift focus to *slot numbers*,
> then I think there's a lot more we can do.

At this point I'm a little bit confused about the definition of "slot
numbers" :) Do you mean the 22, 25, ...?

> >> Another curious thing is that you refer to "slot 10", but there's
> >> no obvious connection between that and the "slot 21" in the PCIe
> >> capability of the Root Port leading to that slot. But I guess
> >> you said the slots are in a backplane (they're not an integral
> >> part of the motherboard). In that case, there's no way for the
> >> motherboard to know what the labels on the backplane are.
> >
> > It is written on the backplane. I said slot 10 because I'm
> > counting
> > the available slot, but on the backplane they are 22, 25, and
> > other
> > no-consecutive numbers.
>
> The 22, 25, etc., are in the same range as the slot numbers in the
> PCIe Slot Capabilities registers, so maybe the backplane is
> constructed to make this possible. The external PCIe chassis I'm
> familiar with have one fast link on a cable leading to the box, with
> a PCIe switch inside the box. The upstream port is connected to
> the incoming link, and there's a downstream port connected to each
> slot. In this case, the slot numbers in the downstream ports' Slot
> Capabilities registers can be made to match the silkscreen labels
> on the board since everything is fixed by the hardware.
>
> Your backplane sounds a little different (you have Ports on the root
> bus leading directly to slots in the backplane, so I assume those
> Ports are on the motherboard, not the backplane), but maybe the
> motherboard & backplane are designed as a unit so the Port slot
> numbers could match the backplane.

Yes, the backplane is almost "empty", except for the 9 PCIe backplane,
which has PCI bridges on it. At the moment I cannot physically check
this kind of backplane, but from the lspci output I infer that there
is a bridge on the backplane, because the motherboard is the same.

>
> > If I use `biosdecode` I can get that information, but only for the
> > "first level" of bridges. On some backplane I have PCI bridges
> > behind bridges, and in this case biosdecode doesn't help: it just
> > tell me about the bridge on the motherboard.
>
> What specific biosdecode information are you using?

I was looking at the "PCI interrupt routing", but it seems that it
returns only information about the last bridge in the interrupt's
routing. Here is an example with a different backplane (9 PCIe).

It seems fine for backplanes without a PCI bridge on them.

I attached two files, one for each type of backplane.


Maybe I'm just misunderstanding the output of biosdecode. I didn't
find an explanation of its output, so I'm guessing at the meaning.

> There's a fair
> amount of stuff in the PCI-to-PCI bridge spec about slot and chassis
> numbering, including some about expansion chassis. I doubt that
> Linux implements all that, so there's probably room for a lot of
> improvement. I attached your lspci output to the bugzilla
> (https://bugzilla.kernel.org/show_bug.cgi?id=72681). Maybe you
> could attach the biosdecode info there, too, and we could see if
> there's a way we can make this easier.

ok

--
Federico Vaga


Attachments:
biosdecode-1-level-bridges (1.83 kB)
biosdecode-n-level-bridges (1.95 kB)

2014-07-08 18:24:03

by Bjorn Helgaas

Subject: Re: PCIe bus enumeration

On Tue, Jul 8, 2014 at 1:15 AM, Federico Vaga <[email protected]> wrote:
>> > So, It looks like that some BIOS disable the bridge when there is
>> > nothing behind it. Why? Power save? :/
>>
>> Could be power savings, or possibly to conserve bus numbers, which
>> are a limited resource.
>
> what is the maximum number of buses?

256.
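
That limit comes from the 8-bit bus field of a PCI address within one
domain: bb:dd.f packs 8 bits of bus, 5 bits of device, and 3 bits of
function. A small sketch of that encoding (the function names are
illustrative only):

```python
MAX_BUSES = 1 << 8       # 8-bit bus field: buses 0x00..0xff
MAX_DEVICES = 1 << 5     # 5-bit device field
MAX_FUNCTIONS = 1 << 3   # 3-bit function field


def parse_bdf(text):
    """Parse "bb:dd.f" (hexadecimal) into (bus, device, function)."""
    bus_text, rest = text.split(":")
    device_text, function_text = rest.split(".")
    bus = int(bus_text, 16)
    device = int(device_text, 16)
    function = int(function_text, 16)
    if not (0 <= bus < MAX_BUSES and 0 <= device < MAX_DEVICES
            and 0 <= function < MAX_FUNCTIONS):
        raise ValueError(f"address out of range: {text!r}")
    return bus, device, function


def encode_bdf(bus, device, function):
    """Pack bus/device/function into the 16-bit form used on the bus."""
    return (bus << 8) | (device << 3) | function
```

For example, parse_bdf("04:0c.0") gives bus 4, device 12, function 0.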

>> Well, it's true that it's hard to get constant *bus numbers*, but
>> it's never really been a good idea to rely on those, because
>> they're assigned at the discretion of the OS, and there are reasons
>> why the OS might want to reallocate them, e.g., to accommodate a
>> deep hot-plugged hierarchy. If you shift focus to *slot numbers*,
>> then I think there's a lot more we can do.
>
> At this point I'm a little bit confused about the definition "slot
> numbers" :) You mean the 22, 25, ...

Right. Bus numbers are under software control, to some degree (as a
general rule, an x86 BIOS assigns them and Linux leaves them alone,
but they *can* be changed so they aren't a good thing to rely on).
The bus number of a root bus is usually determined by hardware or by
an arch-specific host bridge driver. The bus number below a PCI-PCI
bridge is determined by the bridge's "secondary bus number" register,
which software can change.
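
To see what a bridge is currently programmed with, the bus number
registers can be read from its configuration space. The following is
only a sketch: it assumes the standard type 1 (PCI-PCI bridge) header
layout, with the header type at offset 0x0E and the primary, secondary
and subordinate bus numbers at 0x18, 0x19 and 0x1A. The function names
are made up for illustration.

```python
def bridge_bus_numbers(config):
    """Return the bus number registers of a PCI-PCI bridge.

    `config` holds at least the first 64 bytes of configuration space.
    Returns None if the device does not have a type 1 header.
    """
    if len(config) < 0x40:
        raise ValueError("need at least 64 bytes of config space")
    header_type = config[0x0E] & 0x7F  # bit 7 is the multi-function flag
    if header_type != 0x01:
        return None
    return {
        "primary": config[0x18],
        "secondary": config[0x19],
        "subordinate": config[0x1A],
    }


def read_config(bdf, sysfs_root="/sys/bus/pci/devices"):
    """Read the first 64 bytes of config space via sysfs.

    `bdf` is a full address such as '0000:00:1e.0'.
    """
    with open("%s/%s/config" % (sysfs_root, bdf), "rb") as f:
        return f.read(64)
```

For the bridge shown as `1e.0-[04-05]` in the lspci -t output earlier
in this thread, `bridge_bus_numbers(read_config("0000:00:1e.0"))`
should report secondary 4 and subordinate 5.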

Slot numbers are based on the Physical Slot Number in the PCIe Slot
Capabilities register. This is set by some hardware mechanism such as
pin strapping or a serial EEPROM. Software can't change it, so you
can rely on it to be constant. (There's also a mechanism for getting
a slot number from ACPI, but that should also return a constant
value). The problem is that I don't think the Linux slot number
support is very good, so I'm sure there's plenty of stuff that we
*should* be able to do that we can't do *yet*.
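
As a rough illustration of where that number lives: the Physical Slot
Number is bits 31:19 of the Slot Capabilities register, which sits at
offset 0x14 inside the PCI Express capability (capability ID 0x10),
and it is only meaningful when the Slot Implemented bit is set in the
PCI Express Capabilities register. The sketch below walks the
capability list of a raw config space dump; the function name is made
up, and reading past the first 64 bytes of the sysfs `config` file
normally requires root.

```python
import struct

PCI_STATUS = 0x06             # 16-bit status register
PCI_STATUS_CAP_LIST = 0x10    # device has a capability list
PCI_CAPABILITY_LIST = 0x34    # pointer to the first capability
PCI_CAP_ID_EXP = 0x10         # PCI Express capability ID
PCI_EXP_FLAGS = 0x02          # PCI Express Capabilities register
PCI_EXP_FLAGS_SLOT = 0x0100   # Slot Implemented
PCI_EXP_SLTCAP = 0x14         # Slot Capabilities register
PSN_SHIFT, PSN_MASK = 19, 0x1FFF


def physical_slot_number(config):
    """Return the Physical Slot Number of a port, or None if it has none."""
    if len(config) < 0x40:
        return None
    (status,) = struct.unpack_from("<H", config, PCI_STATUS)
    if not status & PCI_STATUS_CAP_LIST:
        return None
    ptr = config[PCI_CAPABILITY_LIST] & 0xFC
    seen = set()  # guards against a looping capability list
    while ptr and ptr not in seen and ptr + 2 <= len(config):
        seen.add(ptr)
        cap_id, next_ptr = config[ptr], config[ptr + 1] & 0xFC
        if cap_id == PCI_CAP_ID_EXP:
            if ptr + PCI_EXP_SLTCAP + 4 > len(config):
                return None  # dump too short to hold the register
            (flags,) = struct.unpack_from("<H", config, ptr + PCI_EXP_FLAGS)
            if not flags & PCI_EXP_FLAGS_SLOT:
                return None  # port is not wired to a slot
            (sltcap,) = struct.unpack_from("<I", config, ptr + PCI_EXP_SLTCAP)
            return (sltcap >> PSN_SHIFT) & PSN_MASK
        ptr = next_ptr
    return None
```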

Bjorn

2014-07-08 19:17:45

by Federico Vaga

Subject: Re: PCIe bus enumeration

On Tuesday 08 July 2014 12:23:39 Bjorn Helgaas wrote:
> On Tue, Jul 8, 2014 at 1:15 AM, Federico Vaga <[email protected]> wrote:
> >> > So, It looks like that some BIOS disable the bridge when there is
> >> > nothing behind it. Why? Power save? :/
> >>
> >> Could be power savings, or possibly to conserve bus numbers, which
> >> are a limited resource.
> >
> > what is the maximum number of buses?
>
> 256.

Well, that is not a small number. I will ask the company that sells
this crate directly what is going on in the BIOS.

> > At this point I'm a little bit confused about the definition "slot
> > numbers" :) You mean the 22, 25, ...
>
> Right. Bus numbers are under software control, to some degree (as a
> general rule, an x86 BIOS assigns them and Linux leaves them alone,
> but they *can* be changed so they aren't a good thing to rely on).
> The bus number of a root bus is usually determined by hardware or
> by an arch-specific host bridge driver. The bus number below a
> PCI-PCI bridge is determined by the bridge's "secondary bus number"
> register, which software can change.
>
> Slot numbers are based on the Physical Slot Number in the PCIe Slot
> Capability register. This is set by some hardware mechanism such as
> pin strapping or a serial EEPROM. Software can't change it, so you
> can rely on it to be constant. (There's also a mechanism for
> getting a slot number from ACPI, but that should also return a
> constant value). The problem is that I don't think the Linux slot
> number support is very good, so I'm sure there's plenty of stuff
> that we *should* be able to do that we can't do *yet*.

Mh, I understand. Let's say that I have time to spend on this problem
(I do not know yet) and to contribute to the PCI subsystem. How many
differences are there between 3.2, 3.6 and 3.16/next? We are using
3.2/3.6 at the moment, but you would probably expect it to work on
the latest version :)

--
Federico Vaga

2014-07-08 20:27:23

by Bjorn Helgaas

Subject: Re: PCIe bus enumeration

On Tue, Jul 8, 2014 at 1:20 PM, Federico Vaga <[email protected]> wrote:
> On Tuesday 08 July 2014 12:23:39 Bjorn Helgaas wrote:
>> On Tue, Jul 8, 2014 at 1:15 AM, Federico Vaga <[email protected]> wrote:
>> >> > So, It looks like that some BIOS disable the bridge when there is
>> >> > nothing behind it. Why? Power save? :/
>> >>
>> >> Could be power savings, or possibly to conserve bus numbers, which
>> >> are a limited resource.
>> >
>> > what is the maximum number of buses?
>>
>> 256.
>
> Well, it is not a small number. I will ask directly to the company who
> sell this crate and ask them what is going on in the BIOS

Yeah, it's not usually a problem until you get to the really big
machines. The BIOS vendor could give you a much better reason; I'm
only speculating.

>> > At this point I'm a little bit confused about the definition "slot
>> > numbers" :) You mean the 22, 25, ...
>>
>> Right. Bus numbers are under software control, to some degree (as a
>> general rule, an x86 BIOS assigns them and Linux leaves them alone,
>> but they *can* be changed so they aren't a good thing to rely on).
>> The bus number of a root bus is usually determined by hardware or
>> by an arch-specific host bridge driver. The bus number below a
>> PCI-PCI bridge is determined by the bridge's "secondary bus number"
>> register, which software can change.
>>
>> Slot numbers are based on the Physical Slot Number in the PCIe Slot
>> Capability register. This is set by some hardware mechanism such as
>> pin strapping or a serial EEPROM. Software can't change it, so you
>> can rely on it to be constant. (There's also a mechanism for
>> getting a slot number from ACPI, but that should also return a
>> constant value). The problem is that I don't think the Linux slot
>> number support is very good, so I'm sure there's plenty of stuff
>> that we *should* be able to do that we can't do *yet*.
>
> Mh, I understand. Let's say that I have time to spend on this problem
> (I do not know) and contributing to the PCI subsystem. How many
> differences are there between 3.2, 3.6, 3.16/next? We are using
> 3.2/3.6 at the moment, but probably you should expect that it will
> work on the last version :)

There are quite a few differences, including a fair amount of work on
the hotplug drivers. The problem in this area is that pciehp (the
PCIe hotplug driver) and acpiphp (the ACPI hotplug driver) both can
register slot numbers and it's sort of ugly to figure out which one to
use in a given situation. Neither can be a loadable module anymore,
which simplifies things a little bit, but it's still ugly.
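
Whichever driver registers a slot, it shows up as a directory under
/sys/bus/pci/slots, with an `address` file holding the domain, bus and
device behind that slot. A small sketch of building the slot-to-address
map Federico is after; the function name is made up, and the slot
names you get depend on the platform and on which driver registered
them:

```python
import os


def slot_addresses(slots_dir="/sys/bus/pci/slots"):
    """Map each registered slot name to the contents of its address file."""
    mapping = {}
    if not os.path.isdir(slots_dir):
        return mapping  # no slots registered on this system
    for name in sorted(os.listdir(slots_dir)):
        path = os.path.join(slots_dir, name, "address")
        try:
            with open(path) as f:
                mapping[name] = f.read().strip()
        except OSError:
            continue  # entry without a readable address file
    return mapping


if __name__ == "__main__":
    for slot, address in slot_addresses().items():
        print("slot %s -> %s" % (slot, address))
```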

Bjorn