2002-11-03 00:50:37

by Lee, Jung-Ik

[permalink] [raw]
Subject: Patch: 2.5.45 PCI Fixups for PCI HotPlug

Hi Greg,

The following patch changes function scopes only but fixes kernel dump on
Hot-Add of PCI bridge cards.

Thanks,
J.I. Lee
Enterprise Server Tech.
Intel Corp.

arch/i386/pci/fixup.c | 4 ++--
drivers/pci/quirks.c | 50
+++++++++++++++++++++++++-------------------------
2 files changed, 27 insertions(+), 27 deletions(-)


Attachments:
php_fix.diff (7.56 kB)

2002-11-03 07:55:01

by Greg KH

[permalink] [raw]
Subject: Re: Patch: 2.5.45 PCI Fixups for PCI HotPlug

On Sat, Nov 02, 2002 at 04:56:58PM -0800, Lee, Jung-Ik wrote:
> Hi Greg,
>
> The following patch changes function scopes only but fixes kernel dump on
> Hot-Add of PCI bridge cards.

Applied, thanks.

Hm, in looking at this, I know the majority of people who want
CONFIG_HOTPLUG probably do not run with CONFIG_PCI_HOTPLUG as the
hardware's still quite rare. To force those people to keep around all
of the PCI quirks functions and tables after init happens, is a bit
cruel. I wonder if it's time to start having different subsystems
modify __devinit depending on their config variables.

Does this sound like a good idea? If so, I can probably knock up
something for the PCI code pretty easily (yes, I'll keep in mind CardBus
stuff, not all hotplug pci is on servers...)

__pci_devinit anyone? :)

thanks,

greg k-h

2002-11-03 12:25:29

by Adam J. Richter

[permalink] [raw]
Subject: Re: Patch: 2.5.45 PCI Fixups for PCI HotPlug

Greg KH wrote:
>Hm, in looking at this, I know the majority of people who want
>CONFIG_HOTPLUG probably do not run with CONFIG_PCI_HOTPLUG as the
>hardware's still quite rare. To force those people to keep around all
>of the PCI quirks functions and tables after init happens, is a bit
>cruel. I wonder if it's time to start having different subsystems
>modify __devinit depending on their config variables.

Are there PCI bridge cards that use all of those? For
example, I thought that Triton was a series of Intel motherboard
chipsets for 586 processors. Perhaps you only need to keep a
few of those routines.

Jung-Ik: perhaps you could to an lspci and an "lspci -n" on
your machine when the bridge card is plugged in, which should provide
enough information to determine which routines you really need to
keep.

>Does this sound like a good idea? If so, I can probably knock up
>something for the PCI code pretty easily (yes, I'll keep in mind CardBus
>stuff, not all hotplug pci is on servers...)
>
>__pci_devinit anyone? :)

I posted a patch to define __usbinit and cousins to the
linux-kernel mailing list about two and a half years ago. The change
has been sitting in my <linux/init.h> since then. I just did a string
replacement to make the matching pci calls. Here an an untested patch
that shows both.

I believe that, theoretically, all __dev{init,exit}{,func}
should only be __dev<bus>{init,exit}{,func}.

--
Adam J. Richter __ ______________ 575 Oroville Road
[email protected] \ / Milpitas, California 95035
+1 408 309-6081 | g g d r a s i l United States of America
"Free Software For The Rest Of Us."


Attachments:
(No filename) (1.66 kB)
diffs (1.04 kB)
Download all attachments

2002-11-04 17:59:56

by Lee, Jung-Ik

[permalink] [raw]
Subject: RE: Patch: 2.5.45 PCI Fixups for PCI HotPlug

> Greg KH wrote:
> >Hm, in looking at this, I know the majority of people who want
> >CONFIG_HOTPLUG probably do not run with CONFIG_PCI_HOTPLUG as the
> >hardware's still quite rare. To force those people to keep
> around all
> >of the PCI quirks functions and tables after init happens, is a bit
> >cruel. I wonder if it's time to start having different subsystems
> >modify __devinit depending on their config variables.
>
> Are there PCI bridge cards that use all of those? For
> example, I thought that Triton was a series of Intel motherboard
> chipsets for 586 processors. Perhaps you only need to keep a
> few of those routines.
>
> Jung-Ik: perhaps you could to an lspci and an "lspci -n" on
> your machine when the bridge card is plugged in, which should provide
> enough information to determine which routines you really need to
> keep.

That sounds a quick fix for now but Greg's __pci_devinit seems to be the
right solution.

thanks,
J.I.

2002-11-04 18:16:53

by Adam J. Richter

[permalink] [raw]
Subject: RE: Patch: 2.5.45 PCI Fixups for PCI HotPlug

>> Greg KH wrote:
>> >Hm, in looking at this, I know the majority of people who want
>> >CONFIG_HOTPLUG probably do not run with CONFIG_PCI_HOTPLUG as the
>> >hardware's still quite rare. To force those people to keep
>> around all
>> >of the PCI quirks functions and tables after init happens, is a bit
>> >cruel. I wonder if it's time to start having different subsystems
>> >modify __devinit depending on their config variables.
>>
>> Are there PCI bridge cards that use all of those? For
>> example, I thought that Triton was a series of Intel motherboard
>> chipsets for 586 processors. Perhaps you only need to keep a
>> few of those routines.
>>
>> Jung-Ik: perhaps you could to an lspci and an "lspci -n" on
>> your machine when the bridge card is plugged in, which should provide
>> enough information to determine which routines you really need to
>> keep.

>That sounds a quick fix for now but Greg's __pci_devinit seems to be the
>right solution.

There is no reason to use __pci_devinit for chipsets that are
only soldered into motherboards. I believe there are only a few of
those quirk handlers that CONFIG_PCI_HOTPLUG users really need to
retain in their kernels.

I would appreciate it if you would do the lspci commands I requested
or just send the contents of your /proc/bus/pci/devices file.

Adam J. Richter __ ______________ 575 Oroville Road
[email protected] \ / Milpitas, California 95035
+1 408 309-6081 | g g d r a s i l United States of America
"Free Software For The Rest Of Us."



2002-11-04 18:37:25

by Alan

[permalink] [raw]
Subject: RE: Patch: 2.5.45 PCI Fixups for PCI HotPlug

On Mon, 2002-11-04 at 18:23, Adam J. Richter wrote:
> There is no reason to use __pci_devinit for chipsets that are
> only soldered into motherboards. I believe there are only a few of
> those quirk handlers that CONFIG_PCI_HOTPLUG users really need to
> retain in their kernels.

You might be suprised what "motherboard" chips turn up in docking
stations.My TP600 for example has an IBM southbridge in it.

2002-11-04 20:23:15

by Adam J. Richter

[permalink] [raw]
Subject: RE: Patch: 2.5.45 PCI Fixups for PCI HotPlug

Alan Cox writes:

>On Mon, 2002-11-04 at 18:23, Adam J. Richter wrote:
>> There is no reason to use __pci_devinit for chipsets that are
>> only soldered into motherboards. I believe there are only a few of
>> those quirk handlers that CONFIG_PCI_HOTPLUG users really need to
>> retain in their kernels.

>You might be suprised what "motherboard" chips turn up in docking
>stations.My TP600 for example has an IBM southbridge in it.

What are you advocating? Do you want all quirk routines
marked __devinit? Wouldn't you agree at least that if we can
establish that device *cannot* be hot plugged, that its quirk routine
can be __init?

I see at least three levels of "hardware reality support":

1. Configurations known to be in use.
2. Possible configurations of hardware products known to exist
3. Hardware products not known to exist, based on chips
and specs known to exist in other real hardware products.

Jeff Garzik recently said that he doesn't want to incorporate
hot plugging for arbitrary standard PCI cards until he hears about a
hotplug configuration that uses that card (I'm cc'ing Jeff so he can
correct me if I misunderstand). That would put the kernel at level 1
in the above list.

You seem to be advocating the third level, but maybe I'm
reading too much into your note.

For what it's worth, although I think that individual
trade-offs can lead to some drivers being exceptions in one direction
or another, I would prefer level 2 for odd kernel versions so that we
can get reports about things like your TP600's chip set. I do think
that patches for hotplug support for arbitrary regular PCI card should
be accepted now that there are backplanes that can hot plug them.

By the way, does your TP600 docking station use one of the
mechanisms that is included by CONFIG_PCI_HOTPLUG or is it compiled in
either way (like CardBus)?

If:
- Your docking station were not controlled by CONFIG_PCI_HOTPLUG
- and your kernel were not configured CONFIG_PCI_HOTPLUG
- and your docking station needed a quirk routine
- and the quirk routine were marked with the proposed __pcidevinit
- and you were to plug in your station after boot

...then you would get a kernel segment fault, even if you have
CONFIG_HOTPLUG defined.

Adam J. Richter __ ______________ 575 Oroville Road
[email protected] \ / Milpitas, California 95035
+1 408 309-6081 | g g d r a s i l United States of America
"Free Software For The Rest Of Us."

2002-11-04 20:38:56

by Jeff Garzik

[permalink] [raw]
Subject: Re: Patch: 2.5.45 PCI Fixups for PCI HotPlug

Adam J. Richter wrote:

> Jeff Garzik recently said that he doesn't want to incorporate
>hot plugging for arbitrary standard PCI cards until he hears about a
>hotplug configuration that uses that card (I'm cc'ing Jeff so he can
>correct me if I misunderstand). That would put the kernel at level 1
>in the above list.
>

No, I was talking more specifically about de2104x driver.

For general drivers I prefer __devinit out of general laziness: if
there is not an active and kernel-clueful maintainer, then it's better
to mark everything __devinit [with the wastage it implies]. But for a
few drivers with obviously provable/disprovable cases, __init may be better.

So it IMO matters WRT maintainer and testability issues as well as
strictly kernel/technical issues.

Jeff




2002-11-05 11:55:52

by Ivan Kokshaysky

[permalink] [raw]
Subject: Re: Patch: 2.5.45 PCI Fixups for PCI HotPlug

On Mon, Nov 04, 2002 at 12:29:37PM -0800, Adam J. Richter wrote:
> What are you advocating? Do you want all quirk routines
> marked __devinit? Wouldn't you agree at least that if we can
> establish that device *cannot* be hot plugged, that its quirk routine
> can be __init?

You cannot mark individual quirk routines differently as long as they
belong in the same quirk list. If the list is __devinitdata and some
of routines in it are __init, you'll have an oops in the hotplug path.

What we need is an additional quirk list, say, "hotplug_pci_fixups"
and a global flag "init_gone" (probably free_initmem() should set it).
Then we'll have

void pci_fixup_device(int pass, struct pci_dev *dev)
{
- pci_do_fixups(dev, pass, pcibios_fixups);
- pci_do_fixups(dev, pass, pci_fixups);
+ if (!init_gone) {
+ pci_do_fixups(dev, pass, pcibios_fixups);
+ pci_do_fixups(dev, pass, pci_fixups);
+ }
+ pci_do_fixups(dev, pass, hotplug_pci_fixups);
}

Perhaps arch specific "hotplug_pcibios_fixups" is also needed, as archs
may work around the same pci bug differently.

What should go into the new list is another story. Obviously the fixups for
host bridges must be __init, as well as some south bridges that look like
pci devices but actually aren't.

Ivan.

2002-11-05 13:10:58

by Adam J. Richter

[permalink] [raw]
Subject: Re: Patch: 2.5.45 PCI Fixups for PCI HotPlug

Ivan Kokshaysky wrote:
>On Mon, Nov 04, 2002 at 12:29:37PM -0800, Adam J. Richter wrote:
>You cannot mark individual quirk routines differently as long as they
>belong in the same quirk list. If the list is __devinitdata and some
>of routines in it are __init, you'll have an oops in the hotplug path.

If pci_do_fixups determines that f->vendor and f->device do
not match the device in question, it will not call the corresponding
f->hook, so it is OK for that f->hook to point to the address of a
discarded __init routine that now contains garbage as long as the
ID's will not match any hot plugged device.

Adam J. Richter __ ______________ 575 Oroville Road
[email protected] \ / Milpitas, California 95035
+1 408 309-6081 | g g d r a s i l United States of America
"Free Software For The Rest Of Us."

2002-11-05 14:22:21

by Ivan Kokshaysky

[permalink] [raw]
Subject: Re: Patch: 2.5.45 PCI Fixups for PCI HotPlug

On Tue, Nov 05, 2002 at 05:17:10AM -0800, Adam J. Richter wrote:
> If pci_do_fixups determines that f->vendor and f->device do
> not match the device in question, it will not call the corresponding
> f->hook, so it is OK for that f->hook to point to the address of a
> discarded __init routine that now contains garbage as long as the
> ID's will not match any hot plugged device.

Unless f->device == PCI_ANY_ID. Plus you won't be able to discard
the list itself.

Ivan.

2002-11-06 00:07:47

by Adam J. Richter

[permalink] [raw]
Subject: Re: Patch: 2.5.45 PCI Fixups for PCI HotPlug

Ivan Kokshaysky wrote:
>On Tue, Nov 05, 2002 at 05:17:10AM -0800, Adam J. Richter wrote:
>> If pci_do_fixups determines that f->vendor and f->device do
>> not match the device in question, it will not call the corresponding
>> f->hook, so it is OK for that f->hook to point to the address of a
>> discarded __init routine that now contains garbage as long as the
>> ID's will not match any hot plugged device.
^^^^^

>Unless f->device == PCI_ANY_ID.

I said "match", not "equal."


>Plus you won't be able to discard the list itself.

Your original claim was:

| You cannot mark individual quirk routines differently as long as they
| belong in the same quirk list. If the list is __devinitdata and some
| of routines in it are __init, you'll have an oops in the hotplug path.

I think that that has been disproven and you are now arguing a
different claim: that while the list *can* safely be __devinitdata
with some __init routines, it might save some space to split the lists
so the list that hold the __init routines can also be __initdata. I'd
just like to be clear so that we understand that your proposal no
longer has the urgency of being needed to solve Jung-Ik's crash. That
is just a matter of changing one or two routines to __devinit and
pci{,bios}_fixups[] to __devinitdata (or to __pcidevinit{,data} if you
want to add that facility).

Given that we're now examining the trade-offs of an optional
change, let's try to quantify it, starting with the benefits.

sizeof(struct pci_fixup) == 12 or 16 bytes

pci_fixups[] has 42 entries == 504 or 672 bytes

i386 pcibios_fixups[] has 17 entries == 204 bytes

If half of these entries would remain __init, so that would
save 354 bytes on i386. Probably more than half can remain __init.
So, on x86, we would expect that ~250-350 bytes of table space could
remain in __initdata. You will also gain a negligible speed
improvement in PCI hotplug insertion (not really a critical path, but
I mention it for completeness).

Now let's look at the costs. The system_running global
variable will already tell you if __init{,data} are no longer valid,
so there is no need to add a new variable, your proposed routine
would become:

static struct pci_fixup pci_fixups_initial[] __initdata = {
...
};

static struct pci_fixup pci_fixups[] __pcidevinitdata = {
...
};

void __pcidevinit pci_fixup_device(int pass, struct pci_dev *dev)
{
pci_do_fixups(dev, pass, pcibios_fixups);
if (!system_running)
pci_do_fixups(dev, pass, pci_fixups_initial);
pci_do_fixups(dev, pass, pci_fixups);
}

(Change __pcidevinit{,data} to __devinit{data,} if that facility
is not added.)

Your additional code will probably be ~30 bytes. We do not
have to count the space for the terminating entries for the additional
tables, because those would be __init. There may be some implicit
ordering beyond PCI_FIXUP_{HEADER,FINAL} for a routine or two that you
will need to duplicate between the __init and __devinit tables or
accomodate in some other way, so let's assume 20 bytes there.

Subtracting the negatives from the positives, your proposal
would probably save ~300 bytes on x86, or ~200 bytes if you only split
pci_fixups[] (as above), more as additional __init fixups appear or if
struct pci_fixup grows.

There is also the question of whether you want to have a third
set of tables for __pcidevinitdata routines. Probably, if you did the
carbus_legacy call separately, you would only need __initdata and
__pcidevinitdata tables, so CONFIG_PCI_HOTPLUG users would keep
one table and regular CONFIG_PCI users would keep none, saving
them another 200-300 bytes (i.e., getting back to the current
utilization of no tables after ).

I like the idea of your change. For what it's worth,
I recommend the following.

1. Immediately change pci{,bios}_fixups[] to __devinitdata
(or __pcidevinitdata if that facility is added, as it appears
that CardBus insertion calls pci_insert_device directly, skipping
pci_fixup_device). This is a real bug. pci_fixup_device is a
__devinit routine walking an __initdata table.

2. If Jung-Ik gets around to posting his lspci or
/proc/bus/pci/devices, change his particular fixup routines to
__devinit.

3. If you want to pursue splitting pci_fixups into
__initdata and __pcidevinitdata (or __devinitdata) as an
optimization, fine. I'd recommend that you start with just
splitting pci_fixups[] so you do not have to coordinate with
a bunch of architectures initially. It's simple and I think
200 bytes for CONFIG_PCI_HOTPLUG users is worth one "if" and
splitting one table. It would not surprise me if Linus or,
better, whoever you go through to feed patches to Linus (I
guess Greg is volunteering for that) might want to wait
until after the "freeze", although I have no objection to
splitting pci_fixups[] now.

Adam J. Richter __ ______________ 575 Oroville Road
[email protected] \ / Milpitas, California 95035
+1 408 309-6081 | g g d r a s i l United States of America
"Free Software For The Rest Of Us."