2013-08-29 21:09:59

by Bjorn Helgaas

Subject: Re: [PATCH v8 6/6] PCI: update device mps when doing pci hotplug

[+cc linux-kernel, since more folks might be interested]

On Mon, Aug 26, 2013 at 6:39 PM, Yinghai Lu <[email protected]> wrote:
> On Mon, Aug 26, 2013 at 2:33 PM, Bjorn Helgaas <[email protected]> wrote:
>> The current Linux default is PCIE_BUS_TUNE_OFF, and given that I don't
>> want to touch any MPS settings in that mode, I don't see a way to
>> safely fix https://bugzilla.kernel.org/show_bug.cgi?id=60671 (the
>> problem with hot-added devices not working because MPS is incorrect).
>> In the long term, I hope we can fix it by making the default
>> PCIE_BUS_SAFE, but that doesn't help right now.
>>
>> That leaves us with only the workaround of booting the Huawei rh5885
>> box with "pci=pcie_bus_safe".
>>
>> I'm willing to accept that because I think we can argue that this is
>> really a BIOS defect. The BIOS *can* program MPS to values that will
>> be safe for hotplug even if the OS does nothing, i.e., it can set
>> MPS=128 in all paths that lead to a hotpluggable slot. I think that's
>> probably what this BIOS *should* do, since it has no way of knowing
>> whether the OS will support hotplug or whether the OS will reprogram
>> any MPS values.
>
> BIOS should have several settings for MPS:
> 1. 128
> 2. auto or performance.
>
> When it is set to Auto, Linux has a problem with hot-add.
>
> The default used to be 128, which is OK, but starting with
> Sandy Bridge and Ivy Bridge, the chipset can support more than 128.
>
> The BIOS wants to set it to Auto.
> The BIOS guys claim that other OSes are OK with Auto, and that only
> Linux has a problem.

I don't understand the argument the BIOS guys are making.

1. If the BIOS sets MPS=128 for all devices all the time, everything
should always work. Performance won't be optimal, but it should
always work.

This requires no OS support at all, so the current Linux default of
doing nothing is fine.

2. If the BIOS sets MPS to something larger than 128 for hardwired
devices only (where no hotplug is possible), the BIOS either knows
that the root complex will split peer-to-peer packets as required (sec
1.3.1), or it assumes there will be no peer-to-peer traffic between
hierarchies.

This should also work fine with no OS support, unless we do
peer-to-peer traffic and the root complex doesn't split packets.

3. If the BIOS sets MPS to something larger than 128 in a path that
leads to a hotplug slot, the BIOS assumes the OS actively manages MPS
for hotplug.

This requires OS support because if we hot-add a device that only
supports MPS=128, the OS must reprogram the upstream path before using
the device.
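
The three cases can be written down as a small classifier. This is an
illustrative sketch only: the `Path` model and the function names are
hypothetical and do not correspond to any kernel or BIOS code.

```python
from dataclasses import dataclass


@dataclass
class Path:
    """One path from the root complex down to a device or slot (hypothetical model)."""
    mps: int        # MPS in bytes that the BIOS programmed along this path
    hotplug: bool   # True if the path leads to a hotpluggable slot


def classify_bios_config(paths):
    """Return 1, 2, or 3, matching the three cases described in the message."""
    if all(p.mps == 128 for p in paths):
        return 1  # MPS=128 everywhere: works with no OS support
    if all(p.mps == 128 for p in paths if p.hotplug):
        return 2  # larger MPS only on hardwired (non-hotplug) paths
    return 3      # larger MPS on a path that leads to a hotplug slot


def needs_os_mps_support(paths):
    """Only case 3 depends on the OS actively managing MPS."""
    return classify_bios_config(paths) == 3
```

For example, a system with one hardwired path at 256 and one hotplug path
at 128 falls into case 2, so it needs no OS support.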

I don't know what the BIOS "auto" setting means, but it must mean
something in case 3, because that's the only case where OS support is
required. But if the OS is smart enough to manage MPS for hot-added
devices, why can't the OS also program MPS for the whole system at
boot-time?

That's why I don't understand what BIOS wants to do. It sounds like
they want the performance benefit of larger MPS for devices present in
hot-plug slots at boot-time, even if the OS doesn't actively manage
MPS and things blow up if that device is replaced with one that
supports a smaller MPS. That choice doesn't make sense.

In case 3, with a non-MPS-aware OS, you get better performance for a
while, but blow up if a card is replaced. And with an MPS-aware OS,
there should be no advantage to case 3: the OS should be able to get
good performance by programming MPS itself, even without help from the
BIOS.
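
What an MPS-aware OS would have to do on hot-add can be sketched as
follows. This is a simplified model under stated assumptions, not the
kernel's implementation: `mps_after_hot_add` is a hypothetical helper,
and it assumes every device on the path ends up with the same MPS.

```python
def mps_after_hot_add(path_mps, device_max_mps):
    """Decide the MPS for a path after a device is hot-added.

    path_mps:       MPS in bytes currently set on each upstream bridge
    device_max_mps: largest MPS in bytes the new device supports

    Returns (new_mps, reprogram), where reprogram is True if any upstream
    bridge must be lowered before the device can be used safely.
    """
    current = min(path_mps)
    # Never raise MPS here; only lower it to what the new device can accept.
    new_mps = min(current, device_max_mps)
    reprogram = any(m > new_mps for m in path_mps)
    return new_mps, reprogram
```

With a path programmed to 256 and a new device that only supports 128,
the result is 128 with reprogramming required, which is exactly the step
a non-MPS-aware OS skips in case 3.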

Bjorn


2013-08-29 21:47:11

by Yinghai Lu

Subject: Re: [PATCH v8 6/6] PCI: update device mps when doing pci hotplug

On Thu, Aug 29, 2013 at 2:09 PM, Bjorn Helgaas <[email protected]> wrote:
> [+cc linux-kernel, since more folks might be interested]

> I don't know what the BIOS "auto" setting means, but it must mean
> something in case 3, because that's the only case where OS support is
> required. But if the OS is smart enough to manage MPS for hot-added
> devices, why can't the OS also program MPS for the whole system at
> boot-time?
>
> That's why I don't understand what BIOS wants to do. It sounds like
> they want the performance benefit of larger MPS for devices present in
> hot-plug slots at boot-time, even if the OS doesn't actively manage
> MPS and things blow up if that device is replaced with one that
> supports a smaller MPS. That choice doesn't make sense.
>
> In case 3, with a non-MPS-aware OS, you get better performance for a
> while, but blow up if a card is replaced. And with an MPS-aware OS,
> there should be no advantage to case 3: the OS should be able to get
> good performance by programming MPS itself, even without help from the
> BIOS.

With the default OS settings in case 3, the other two OSes are OK with
hotplug, but Linux is not.

Yinghai

2013-08-29 22:22:45

by Bjorn Helgaas

Subject: Re: [PATCH v8 6/6] PCI: update device mps when doing pci hotplug

On Thu, Aug 29, 2013 at 3:47 PM, Yinghai Lu <[email protected]> wrote:
> On Thu, Aug 29, 2013 at 2:09 PM, Bjorn Helgaas <[email protected]> wrote:
>> [+cc linux-kernel, since more folks might be interested]
>
>> I don't know what the BIOS "auto" setting means, but it must mean
>> something in case 3, because that's the only case where OS support is
>> required. But if the OS is smart enough to manage MPS for hot-added
>> devices, why can't the OS also program MPS for the whole system at
>> boot-time?
>>
>> That's why I don't understand what BIOS wants to do. It sounds like
>> they want the performance benefit of larger MPS for devices present in
>> hot-plug slots at boot-time, even if the OS doesn't actively manage
>> MPS and things blow up if that device is replaced with one that
>> supports a smaller MPS. That choice doesn't make sense.
>>
>> In case 3, with a non-MPS-aware OS, you get better performance for a
>> while, but blow up if a card is replaced. And with an MPS-aware OS,
>> there should be no advantage to case 3: the OS should be able to get
>> good performance by programming MPS itself, even without help from the
>> BIOS.
>
> With the default OS settings in case 3, the other two OSes are OK with
> hotplug, but Linux is not.

I'm not disputing that. I said plainly that in case 3, things blow up
if the OS doesn't actively manage MPS. By default Linux doesn't touch
MPS, so it blows up if a card is replaced.

I am suggesting that it doesn't make any sense for a BIOS to use case
3. Please make an argument for why it *does* make sense to use case
3.

Note that I think Linux *should* eventually actively manage MPS, and
when it does, case 3 should "just work". I just don't understand what
the point of the BIOS using case 3 is.

I suppose other OSes must get better performance in this "auto" mode?
(What exactly is that mode, anyway?) That means the other OS must be
smart enough to deal with hotplug device replacement, but not smart
enough to configure MPS all by itself starting from scratch. I don't
know what rules would tell us "this MPS must be configured by the BIOS
and the OS should leave it alone" and "the OS must configure MPS on
this device for hotplug." How can we make sense out of that?

Bjorn

2013-08-29 22:46:13

by Yinghai Lu

Subject: Re: [PATCH v8 6/6] PCI: update device mps when doing pci hotplug

On Thu, Aug 29, 2013 at 3:22 PM, Bjorn Helgaas <[email protected]> wrote:
>
> Note that I think Linux *should* eventually actively manage MPS, and
> when it does, case 3 should "just work". I just don't understand what
> the point of the BIOS using case 3 is.
>
> I suppose other OSes must get better performance in this "auto" mode?

Yes.

> (What exactly is that mode, anyway?) That means the other OS must be
> smart enough to deal with hotplug device replacement, but not smart
> enough to configure MPS all by itself starting from scratch. I don't
> know what rules would tell us "this MPS must be configured by the BIOS
> and the OS should leave it alone" and "the OS must configure MPS on
> this device for hotplug." How can we make sense out of that?

So my suggestion:
We scan the MPS settings of the bridges to find out whether any is set
to something other than 128. If any bridge leads to a hotplug slot and
its MPS is not 128, we switch that system to PCIE_BUS_PERFORMANCE.
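
A minimal sketch of this heuristic, for illustration only. The `Bridge`
model, the `BusConfig` enum, and `choose_bus_config` are hypothetical
stand-ins; the enum merely mirrors the idea of the kernel's bus tuning
modes and is not the kernel's own definition.

```python
from dataclasses import dataclass
from enum import Enum, auto


class BusConfig(Enum):
    """Hypothetical stand-in for the bus tuning modes discussed in the thread."""
    TUNE_OFF = auto()
    SAFE = auto()
    PERFORMANCE = auto()


@dataclass
class Bridge:
    mps: int          # MPS in bytes as left by the BIOS
    is_hotplug: bool  # True if the bridge leads to a hotplug slot


def choose_bus_config(bridges, default=BusConfig.TUNE_OFF):
    """Switch to performance tuning if the BIOS raised MPS on any hotplug bridge."""
    for bridge in bridges:
        if bridge.is_hotplug and bridge.mps != 128:
            return BusConfig.PERFORMANCE
    return default
```

Note that a raised MPS on a non-hotplug bridge alone does not change the
mode in this sketch; only a hotplug bridge with MPS other than 128 does.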

Yinghai

2013-08-30 15:41:23

by Bjorn Helgaas

Subject: Re: [PATCH v8 6/6] PCI: update device mps when doing pci hotplug

On Thu, Aug 29, 2013 at 4:46 PM, Yinghai Lu <[email protected]> wrote:
> On Thu, Aug 29, 2013 at 3:22 PM, Bjorn Helgaas <[email protected]> wrote:
>>
>> Note that I think Linux *should* eventually actively manage MPS, and
>> when it does, case 3 should "just work". I just don't understand what
>> the point of the BIOS using case 3 is.
>>
>> I suppose other OSes must get better performance in this "auto" mode?
>
> Yes.

My take on this is that "auto" mode really means "Windows" mode,
because it's a BIOS workaround for shortcomings in Windows. It's
tailored to do system-wide MPS configuration that Windows doesn't do,
while relying on Windows to do minimal reconfiguration after a
hot-plug. I don't feel any particular urgency to make Linux work with
that.

In my opinion, a BIOS should configure the machine in the safest
possible way. Then everything works, and if we boot an OS that is
smart enough to reconfigure it in a more optimal way, that's great,
but it's not required. For MPS, I think that means configuring the
machine as I outlined in case 1 (MPS=128 always) or case 2 (larger MPS
allowed on non-hotplug paths if the BIOS knows the root complex splits
packets).
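
As a rough illustration of that "safest possible" policy, the sketch
below picks an MPS for one path. The function name and its parameters
are hypothetical, and real firmware has to consider more than this.

```python
def safe_bios_mps(device_caps, leads_to_hotplug_slot, root_splits_packets):
    """Pick a conservative MPS in bytes for one path.

    device_caps:           largest MPS each device on the path supports
    leads_to_hotplug_slot: True if the path ends in a hotpluggable slot
    root_splits_packets:   True if the BIOS knows the root complex splits
                           peer-to-peer packets as required
    """
    # Case 1: 128 always works, so use it whenever there is any doubt.
    if leads_to_hotplug_slot or not root_splits_packets:
        return 128
    # Case 2: hardwired path, so the largest value every device supports is safe.
    return min(device_caps, default=128)
```

A hardwired path whose devices support 256 and 512 gets 256 when the
root complex splits packets; any hotplug path stays at 128.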

>> (What exactly is that mode, anyway?) That means the other OS must be
>> smart enough to deal with hotplug device replacement, but not smart
>> enough to configure MPS all by itself starting from scratch. I don't
>> know what rules would tell us "this MPS must be configured by the BIOS
>> and the OS should leave it alone" and "the OS must configure MPS on
>> this device for hotplug." How can we make sense out of that?
>
> So my suggestion:
> We scan the MPS settings of the bridges to find out whether any is set
> to something other than 128. If any bridge leads to a hotplug slot and
> its MPS is not 128, we switch that system to PCIE_BUS_PERFORMANCE.

That seems too arbitrary and magic to me. I don't really want to set
modes based on what we discover in the machine. If we're going to do
MPS config, we should just do it right and do it everywhere.

This discussion is not really going anywhere because we don't have any
concrete changes on the table, so I'm going to try to resist
continuing this thread :)

Bjorn