Ok, I now have two confirmations from independent people (Adam Richter
and Kai Germaschewski) who have completely different hardware, but have
the same problem: irq routing seems to not work for them.
In both cases it is because the PCI device config space already has an
entry for the interrupt, but the interrupt is apparently not actually
routed on the irq router.
WHY this happens is unclear, but it could be several reasons:
- undocumented "Plug'n'Play OS true behaviour"
- BIOS bugs. 'nuff said.
- warm-booting from an OS that _does_ set the interrupt routing,
and also sets the PCI config space thing
The problem can be fairly easily fixed by just removing the test for
whether "pci->dev" has already got set. This, btw, is important for
another reason too - there is nothing that says that a sleep event
wouldn't turn off the irq routing, so we _have_ to have some way of
forcing the interrupt routing to take effect even if we already think we
have the correct irq.
However, Martin is certainly also right in claiming that there might be
bugs the "other" way, and just completely dismissing the irq we already
are claimed to have would be bad.
This is my current suggested patch for the problem.
Adam, Kai, can you verify that it fixes the issues on your systems?
Anybody else who has had problems with PCI interrupt routing (tends to be
"new" devices like CardBus, ACPI, USB etc), can you verify that this
either fixes things or at least doesn't break a setup that had started
working earlier..
Martin, what do you think? We definitely need something like this, but
maybe we could add a few more sanity-tests?
Linus
----
--- v2.4.0-test11/linux/arch/i386/kernel/pci-irq.c Sun Nov 19 18:44:03 2000
+++ linux/arch/i386/kernel/pci-irq.c Tue Dec 5 14:38:13 2000
@@ -405,9 +424,12 @@
DBG(" -> PIRQ %02x, mask %04x, excl %04x", pirq, mask, pirq_table->exclusive_irqs);
mask &= pcibios_irq_mask;
- /* Find the best IRQ to assign */
- newirq = 0;
- if (assign) {
+ /*
+ * Find the best IRQ to assign: use the one
+ * reported by the device if possible.
+ */
+ newirq = dev->irq;
+ if (!newirq && assign) {
for (i = 0; i < 16; i++) {
if (!(mask & (1 << i)))
continue;
@@ -417,16 +439,22 @@
newirq = i;
}
}
- DBG(" -> newirq=%d", newirq);
}
+ DBG(" -> newirq=%d", newirq);
/* Try to get current IRQ */
if (r->get && (irq = r->get(pirq_router_dev, d, pirq))) {
DBG(" -> got IRQ %d\n", irq);
msg = "Found";
+ /* We refuse to override the dev->irq information. Give a warning! */
+ if (dev->irq && dev->irq != irq) {
+ printk("IRQ routing conflict in pirq table! Try 'pci=autoirq'\n");
+ return 0;
+ }
} else if (newirq && r->set && (dev->class >> 8) != PCI_CLASS_DISPLAY_VGA) {
DBG(" -> assigning IRQ %d", newirq);
if (r->set(pirq_router_dev, d, pirq, newirq)) {
+ eisa_set_level_irq(newirq);
DBG(" ... OK\n");
msg = "Assigned";
irq = newirq;
@@ -556,19 +584,17 @@
void pcibios_enable_irq(struct pci_dev *dev)
{
- if (!dev->irq) {
- u8 pin;
- pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin);
- if (pin && !pcibios_lookup_irq(dev, 1)) {
- char *msg;
- if (io_apic_assign_pci_irqs)
- msg = " Probably buggy MP table.";
- else if (pci_probe & PCI_BIOS_IRQ_SCAN)
- msg = "";
- else
- msg = " Please try using pci=biosirq.";
- printk(KERN_WARNING "PCI: No IRQ known for interrupt pin %c of device %s.%s\n",
- 'A' + pin - 1, dev->slot_name, msg);
- }
+ u8 pin;
+ pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin);
+ if (pin && !pcibios_lookup_irq(dev, 1) && !dev->irq) {
+ char *msg;
+ if (io_apic_assign_pci_irqs)
+ msg = " Probably buggy MP table.";
+ else if (pci_probe & PCI_BIOS_IRQ_SCAN)
+ msg = "";
+ else
+ msg = " Please try using pci=biosirq.";
+ printk(KERN_WARNING "PCI: No IRQ known for interrupt pin %c of device %s.%s\n",
+ 'A' + pin - 1, dev->slot_name, msg);
}
}
On Tue, 5 Dec 2000, Linus Torvalds wrote:
> WHY this happens is unclear, but it could be several reasons:
> - undocumented "Plug'n'Play OS true behaviour"
> - BIOS bugs. 'nuff said.
> - warm-booting from an OS that _does_ set the interrupt routing,
> and also sets the PCI config space thing
It's not the last reason here, i.e. it happens when cold booting into
Linux (and when warm booting from Linux into Linux). I could try warm
booting from Win98, however, if anybody cares. BTW: Win98 crashes while
booting when PnP OS is set to no (Linux works flawlessly, though).
> Adam, Kai, can you verify that it fixes the issues on your systems?
It works fine here - thanks again.
--Kai
On Tue, 5 Dec 2000, Linus Torvalds wrote:
> Anybody else who has had problems with PCI interrupt routing (tends to be
> "new" devices like CardBus, ACPI, USB etc), can you verify that this
> either fixes things or at least doesn't break a setup that had started
> working earlier..
problems with recent 2.4.0-test1* on my HP OmniBook 800 are probably
combined PCMCIA(CB) / PCI / APM issues. The point is my 16bit cards
(modem+ne2k) are working perfectly fine with yenta sockets until the first
suspend/resume. Afterwards the PCI config space of the Cardbus
bridge(s) is completely messed up forcing me to reboot.
So I just applied your patch vs. 2.4.0-t12p3 (had to cleanup one rejected
hunk due to an eisa_set_level_irq() which is already there).
pcmcia-cs is 3.1.22.
result: issue remains unchanged but nothing seems to be broken so far.
The only difference I've noticed is the following two lines appearing when
modprobing the pcmcia_core/yenta stuff:
IRQ for 00:04.0(0) via 00:04.0 -> PIRQ 01, mask 8eb8, \
excl 0000 -> newirq=9 ... failed
IRQ for 00:04.1(1) via 00:04.1 -> PIRQ 04, mask 8eb8, \
excl 0000 -> newirq=7 ... failed
My guess: might be due to the PCI-IRQ-router (VLSI 82C534 PCI-bridge,
id=1004:0102) without special support (defaults to r->get == NULL).
Furthermore, I've noticed at 2.4.0-t10 the PCI-IRQ's of the CB-bridges
were lost (reset to 0) during suspend/resume whereas at 2.4.0-t12p3 they
survive (-t11 not tried). However memory and io-mapping get lost.
I'd consider this the main reason for the failure, but I'm not sure
whether it's the Cardbus bridges' fault or a PCI or APM issue.
But nothing - neither fixed nor broken - has changed for me by this patch,
except for the two lines which apparently do not matter anyway.
attached: dmesg and lspci traces with some comments.
What more information/debugging would be helpful?
Regards
Martin
On Wed, 6 Dec 2000, Martin Diehl wrote:
>
> problems with recent 2.4.0-test1* on my HP OmniBook 800 are probably
> combined PCMCIA(CB) / PCI / APM issues. The point is my 16bit cards
> (modem+ne2k) are working perfectly fine with yenta sockets until the first
> suspend/resume. Afterwards the PCI config space of the Cardbus
> bridge(s) is completely messed up forcing me to reboot.
Ok. I actually knew of this issue, and it should be trivial to fix: we
should save and restore the 256-byte config space over suspends. CardBus
isn't the only controller that would need it.
Can you remind me in a day or two if I haven't gotten back to you? I don't
have any machines that need this, but I've seen ones that do, and if
you're willing to test..
> result: issue remains unchanged but nothing seems to be broken so far.
> The only difference I've noticed is the following two lines appearing when
> modprobing the pcmcia_core/yenta stuff:
>
> IRQ for 00:04.0(0) via 00:04.0 -> PIRQ 01, mask 8eb8, \
> excl 0000 -> newirq=9 ... failed
> IRQ for 00:04.1(1) via 00:04.1 -> PIRQ 04, mask 8eb8, \
> excl 0000 -> newirq=7 ... failed
Yes, this is expected for routers that we don't know about: we will still
use the irq that the device claims it has, but we will obviously fail to
try to route it (but it still works if the BIOS had already routed it -
which is how the old code always worked anyway).
Anyway, for the suspend-resume thing, if you want to go ahead on your own
without a real patch from me, the fix is along the lines of
- add a "u32 config_state[64]" array to pci_socket_t
- add two functions:
static void yenta_save_config(pci_socket_t *socket)
{
struct pci_dev *dev = socket->dev;
int i;
for (i = 0; i < 64; i++)
pci_read_config_dword(dev, i*4, socket->config_state+i);
}
static void yenta_restore_config(pci_socket_t *socket)
{
struct pci_dev *dev = socket->dev;
int i;
for (i = 0; i < 64; i++)
pci_write_config_dword(dev, i*4, socket->config_state[i]);
}
- do a "yenta_save_config()" in "yenta_suspend()" and a
"yenta_restore_config()" at the top of "yenta_resume()"
- test. Also test with the "pci_set_power_state(3)" in suspend enabled,
because it may/should actually work with that enabled too.
Any change in suspend/resume from the above?
Linus
On Wed, 6 Dec 2000, Linus Torvalds wrote:
> On Wed, 6 Dec 2000, Martin Diehl wrote:
> >
> [Cardbus config space lost after APM suspend/resume]
>
> Can you remind me in a day or two if I haven't gotten back to you? I don't
> have any machines that need this, but I've seen ones that do, and if
> you're willing to test..
sure, will to do testing (and reminding ;-)
> Yes, this is expected for routers that we don't know about: we will still
> use the irq that the device claims it has, but we will obviously fail to
> try to route it (but it still works if the BIOS had already routed it -
> which is how the old code always worked anyway).
btw, I'm thinking I could guess the routing from the VLSI config space,
but I don't have any doc's. Would it be worth to try to add some specific
get/set methods for this device? What about testers (or people who have
access to the docs)?
> Anyway, for the suspend-resume thing, if you want to go ahead on your own
> without a real patch from me, the fix is along the lines of
well, took me some time to follow all the paths thru cardbus/pcmcia stuff
wrt suspend/resume from pm - but ended up at:
> - add two functions:
>
> static void yenta_save_config(pci_socket_t *socket)
> static void yenta_restore_config(pci_socket_t *socket)
That's the crucial point, imho. The PCI layer forwards the PM events to
the cardbus-driver's suspend/resume methods, which are calling
pcmcia_suspend/resume_socket(). The latter in turn will call back the
appropriate yenta_operations which are registered to it. So much for sure.
However, there is no pcmcia_resume path forwarded to yenta since the
traditional pccard_operations did not provide such a method and pcmcia
simply re-initialized it's sockets. My assumption is, you haven't meant
to add a do-nothing resume to all the pcmcia-stuff (including i82365,
tcic) just to allow yenta to register for a resume operation which would
be there for cardbus only.
So my suggestion is to have cardbus_save/restore_config() exactly doing
what you've said for yenta_*.
> - do a "yenta_save_config()" in "yenta_suspend()" and a
> "yenta_restore_config()" at the top of "yenta_resume()"
yenta_resume() does not exist.
yenta_*() replaced by cardbus_*() as explained.
> - test. Also test with the "pci_set_power_state(3)" in suspend enabled,
> because it may/should actually work with that enabled too.
same point: pci_set_power_state(3) should go to cardbus_suspend(), not
yenta.
A first try of this ended oopsing at pm suspend somewhere below
pci_pm_*(). It turned out the reason was the cardbus_suspend() (and resume
too) method which was *invoked several times* in a row!
The reason for this is in drivers/pci.c where bridges are touched
twice: once as a device on a bus and once via ->self from the bus behind.
I'm not sure whether this is the intended behavior - but it definitely
calls cardbus_suspend/resume() twice which breaks when forwarding to
pcmcia_suspend/resume_socket().
So I've tentatively worked around using a "once" flag added to
pci_socket_t. This solves the problems during suspend/resume and the
cardbus' config space appears to be restored as intended - good.
The bad news however is, the sockets are still broken after
resume. Unfortunately there are several candidates I've spotted:
- calling yenta_init() stuff at resume - is this sufficient?
Probably we have to forward the pm-triggered resume from pm along
pci -> cardbus -> pcmcia -> yenta (last link currently missed,
because the pcmcia layer switches from incoming resume notification
to init path)
- some content of the mem/io regions might need to be preserved
- some TI1131 oddity wrt to CSC-INT's - requested IRQ's show up correctly
in /proc/interrupts and are properly triggered and handled at card
insert/eject. But after pm suspend/resume the box freezed when inserting
or ejecting the cards (no response to SysRq anymore).
I'll try to continue on this.
Regards
Martin
On Thu, 7 Dec 2000, Martin Diehl wrote:
>
> btw, I'm thinking I could guess the routing from the VLSI config space,
> but I don't have any doc's. Would it be worth to try to add some specific
> get/set methods for this device? What about testers (or people who have
> access to the docs)?
Please do. You might leave them commented out right now, but this is
actually how most of the pirq router entries have been created: by looking
at various pirq tables and matching them up with other (non-pirq)
documentation and testing. Th epirq "link" value is basically completely
NDA'd, and is per-chipset-specific. Nobody documents it except in their
bios writers guide, if even there.
> yenta_resume() does not exist.
> yenta_*() replaced by cardbus_*() as explained.
Yes.
> The reason for this is in drivers/pci.c where bridges are touched
> twice: once as a device on a bus and once via ->self from the bus behind.
> I'm not sure whether this is the intended behavior - but it definitely
> calls cardbus_suspend/resume() twice which breaks when forwarding to
> pcmcia_suspend/resume_socket().
Not intended behaviour. The self case should be removed.
> So I've tentatively worked around using a "once" flag added to
> pci_socket_t. This solves the problems during suspend/resume and the
> cardbus' config space appears to be restored as intended - good.
>
> The bad news however is, the sockets are still broken after
> resume. Unfortunately there are several candidates I've spotted:
>
> - calling yenta_init() stuff at resume - is this sufficient?
> Probably we have to forward the pm-triggered resume from pm along
> pci -> cardbus -> pcmcia -> yenta (last link currently missed,
> because the pcmcia layer switches from incoming resume notification
> to init path)
>
> - some content of the mem/io regions might need to be preserved
>
> - some TI1131 oddity wrt to CSC-INT's - requested IRQ's show up correctly
> in /proc/interrupts and are properly triggered and handled at card
> insert/eject. But after pm suspend/resume the box freezed when inserting
> or ejecting the cards (no response to SysRq anymore).
Ok, definitely needs some more work. Thanks for testing - I have no
hardware where this is needed.
Linus
On Thu, 7 Dec 2000, Linus Torvalds wrote:
> > btw, I'm thinking I could guess the routing from the VLSI config space,
>
> Please do. You might leave them commented out right now, but this is
> actually how most of the pirq router entries have been created: by looking
> at various pirq tables and matching them up with other (non-pirq)
> documentation and testing. Th epirq "link" value is basically completely
> NDA'd, and is per-chipset-specific. Nobody documents it except in their
> bios writers guide, if even there.
Ok - will do it. Unfortunately the BIOS of this notebook has no
customizeable routing option which I could use to to play with.
So testing here will hardly cover orthogonal cases.
> > The reason for this is in drivers/pci.c where bridges are touched
> > twice: once as a device on a bus and once via ->self from the bus behind.
>
> Not intended behaviour. The self case should be removed.
I was wondering whether there might be bridges which have to be awoken
from both sides because they have different config spaces there.
Is bus->children->self guarantied to be identical to bus->device for
all kinds of bridge devices?
Sure, dividing bridges wouldn't make too much sense - at least I don't see
what half a bridge might be good for, but ...
Removing self cases is straightforward - pci_pm-2.4.0-t9p3-patch below.
> Ok, definitely needs some more work. Thanks for testing - I have no
> hardware where this is needed.
Could do some more testing if a day or two for feedback is ok.
Two more things I've noticed:
- when all pcmcia/yenta stuff is in modules and doing suspend/resume
immediately after fresh cold reboot there is nothing our cardbus stuff
might have set up which was lost in suspend. Nevertheless, what happens is
the pcmcia_core/yenta_socket/ds modules get loaded without problem but the
"Socket status" printk from yenta_open_bh() is completely garbage. This is
not the case when the modules are loaded before the suspend.
Despite the garbage, subsequent cardmgr startup does not give any error
message - but the cards in the slots are not recognized (no beeps, no
status to retrieve from cardctl). Reboot is the only solution.
My conclusion is, the reason must be in the init-path doing or forgetting
something prohibited/required after suspend - or the TI1131 is broken.
- when, after yenta sockets became unusable due to pm suspend, I try to
eject/insert the cards from a slot, the box freezes. This turned out
to be a loop in yenta_interrupt being called endlessly. Apparently the
yenta_bh() -> pcmcia-handler path somehow triggers the next IRQ.
But this might be a consequence of the former issue.
According to the forecasts, next weekend will be rainy, so...
Thank you for the time!
Regards
Martin
-----
--- linux-2.4.0-t12p3/drivers/pci/pci.c.orig Mon Dec 4 14:21:26 2000
+++ linux-2.4.0-t12p3/drivers/pci/pci.c Fri Dec 8 00:17:50 2000
@@ -1089,6 +1089,9 @@
return 0;
}
+
+/* take care to suspend/resume bridges only once */
+
static int pci_pm_suspend_bus(struct pci_bus *bus)
{
struct list_head *list;
@@ -1100,9 +1103,6 @@
/* Walk the device children list */
list_for_each(list, &bus->devices)
pci_pm_suspend_device(pci_dev_b(list));
-
- /* Suspend the bus controller.. */
- pci_pm_suspend_device(bus->self);
return 0;
}
@@ -1110,8 +1110,6 @@
{
struct list_head *list;
- pci_pm_resume_device(bus->self);
-
/* Walk the device children list */
list_for_each(list, &bus->devices)
pci_pm_resume_device(pci_dev_b(list));
@@ -1125,18 +1123,26 @@
static int pci_pm_suspend(void)
{
struct list_head *list;
+ struct pci_bus *bus;
- list_for_each(list, &pci_root_buses)
- pci_pm_suspend_bus(pci_bus_b(list));
+ list_for_each(list, &pci_root_buses) {
+ bus = pci_bus_b(list);
+ pci_pm_suspend_bus(bus);
+ pci_pm_suspend_device(bus->self);
+ }
return 0;
}
static int pci_pm_resume(void)
{
struct list_head *list;
+ struct pci_bus *bus;
- list_for_each(list, &pci_root_buses)
- pci_pm_resume_bus(pci_bus_b(list));
+ list_for_each(list, &pci_root_buses) {
+ bus = pci_bus_b(list);
+ pci_pm_resume_device(bus->self);
+ pci_pm_resume_bus(bus);
+ }
return 0;
}
On Thu, 7 Dec 2000, Linus Torvalds wrote:
> > btw, I'm thinking I could guess the routing from the VLSI config space,
> > but I don't have any doc's. Would it be worth to try to add some specific
>
> Please do. You might leave them commented out right now, but this is
Ok. Apparently it's the "pirq is nibble index in config space" kind of
routing which makes guessing a change bios and lspci procedure.
patch vs. 2.4.0-t12p8 below. Tested as (in)complete as my bios permits.
Works fine for several days and correctly assigns IRQ's when unassigned
due to "pnp os". So I feel confident enough to not leave it commented out.
Test example attached.
Regards
Martin
-----
diff -Nur linux-2.4.0-t12p8/arch/i386/kernel/pci-irq.c linux-2.4.0-t12p8-md/arch/i386/kernel/pci-irq.c
--- linux-2.4.0-t12p8/arch/i386/kernel/pci-irq.c Mon Dec 11 00:29:42 2000
+++ linux-2.4.0-t12p8-md/arch/i386/kernel/pci-irq.c Mon Dec 11 00:58:48 2000
@@ -298,6 +298,33 @@
return 1;
}
+/*
+ * VLSI: nibble offset 0x74 - educated guess due to routing table and
+ * config space of VLSI 82C534 PCI-bridge/router (1004:0102)
+ * Tested on HP OmniBook 800 covering PIRQ 1, 2, 4, 8 for onboard
+ * devices, PIRQ 3 for non-pci(!) soundchip and (untested) PIRQ 6
+ * for the busbridge to the docking station.
+ */
+
+static int pirq_vlsi_get(struct pci_dev *router, struct pci_dev *dev, int pirq)
+{
+ if (pirq > 8) {
+ printk("VLSI router pirq escape (%d)\n", pirq);
+ return 0;
+ }
+ return read_config_nybble(router, 0x74, pirq-1);
+}
+
+static int pirq_vlsi_set(struct pci_dev *router, struct pci_dev *dev, int pirq, int irq)
+{
+ if (pirq > 8) {
+ printk("VLSI router pirq escape (%d)\n", pirq);
+ return 0;
+ }
+ write_config_nybble(router, 0x74, pirq-1, irq);
+ return 1;
+}
+
#ifdef CONFIG_PCI_BIOS
static int pirq_bios_set(struct pci_dev *router, struct pci_dev *dev, int pirq, int irq)
@@ -329,6 +356,7 @@
{ "NatSemi", PCI_VENDOR_ID_CYRIX, PCI_DEVICE_ID_CYRIX_5520, pirq_cyrix_get, pirq_cyrix_set },
{ "SIS", PCI_VENDOR_ID_SI, PCI_DEVICE_ID_SI_503, pirq_sis_get, pirq_sis_set },
+ { "VLSI 82C534", PCI_VENDOR_ID_VLSI, PCI_DEVICE_ID_VLSI_82C534, pirq_vlsi_get, pirq_vlsi_set },
{ "default", 0, 0, NULL, NULL }
};
Martin,
I finally got access to a machine that truly has multiple PCI buses and
bridges in between them, and at least for that machine the x86 IRQ lookup
does not work at all.
The problem seems to be the "pci_get_interrupt_pin()" call. We should not
do that. The pirq table has the unmodified device information - and when
we try to swizzle the pins and find the bridge that the device is behind,
we're trying to be way too clever.
Instead of doing
pin = pci_get_interrupt_pin(dev, &d);
if (pin < 0) {
DBG(" -> no interrupt pin\n");
return 0;
}
I think we should be doing:
u8 pin;
pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin);
if (!pin) {
DBG(" -> no interrupt pin\n");
return 0;
}
pin--;
d = dev;
and be done with it. No swizzling, no nothing.
On the machine I just saw, this would have given the right end result.
On machines with just one bus, we'd never see the difference.
Comments?
Linus
>The problem seems to be the "pci_get_interrupt_pin()" call. We should not
>do that. The pirq table has the unmodified device information - and when
>we try to swizzle the pins and find the bridge that the device is behind,
>we're trying to be way too clever.
Both with/without the change you mention, I still get routing warnings
on bootup. I've pasted the output from my Athlon box, as that seems
most affected by this. Interrupts 11 & 12 are left free, whilst it
routes multiple devices onto interrupt 5.
PCI: Found IRQ 5 for device 00:09.0
PCI: The same IRQ used for device 00:04.2
PCI: The same IRQ used for device 00:04.3
emu10k1: EMU10K1 rev 5 model 0x20 found, IO at 0xb800-0xb81f, IRQ 5
PCI: Found IRQ 5 for device 00:04.2
PCI: The same IRQ used for device 00:04.3
PCI: The same IRQ used for device 00:09.0
usb-uhci.c: USB UHCI at I/O 0xd400, IRQ 5
PCI: Found IRQ 5 for device 00:04.3
PCI: The same IRQ used for device 00:04.2
PCI: The same IRQ used for device 00:09.0
usb-uhci.c: USB UHCI at I/O 0xd000, IRQ 5
Interrupt routing table found at address 0xf0e90:
Version 1.0, size 0x0070
Interrupt router is device 00:04.0
PCI exclusive interrupt mask: 0x0000 []
Compatible router: vendor 0x1106 device 0x0686
Device 00:0c.0 (slot 1):
INTA: link 0x01, irq mask 0x1eb8 [3,4,5,7,9,10,11,12]
INTB: link 0x02, irq mask 0x1eb8 [3,4,5,7,9,10,11,12]
INTC: link 0x03, irq mask 0x1eb8 [3,4,5,7,9,10,11,12]
INTD: link 0x05, irq mask 0x1eb8 [3,4,5,7,9,10,11,12]
Device 00:0b.0 (slot 2): Ethernet controller
INTA: link 0x02, irq mask 0x1eb8 [3,4,5,7,9,10,11,12]
INTB: link 0x03, irq mask 0x1eb8 [3,4,5,7,9,10,11,12]
INTC: link 0x05, irq mask 0x1eb8 [3,4,5,7,9,10,11,12]
INTD: link 0x01, irq mask 0x1eb8 [3,4,5,7,9,10,11,12]
Device 00:09.0 (slot 4): Multimedia audio controller
INTA: link 0x05, irq mask 0x1eb8 [3,4,5,7,9,10,11,12]
INTB: link 0x01, irq mask 0x1eb8 [3,4,5,7,9,10,11,12]
INTC: link 0x02, irq mask 0x1eb8 [3,4,5,7,9,10,11,12]
INTD: link 0x03, irq mask 0x1eb8 [3,4,5,7,9,10,11,12]
Device 00:04.0 (slot 0): ISA bridge
INTA: link 0x01, irq mask 0x1eb8 [3,4,5,7,9,10,11,12]
INTB: link 0x02, irq mask 0x1eb8 [3,4,5,7,9,10,11,12]
INTC: link 0x03, irq mask 0x1eb8 [3,4,5,7,9,10,11,12]
INTD: link 0x05, irq mask 0x1eb8 [3,4,5,7,9,10,11,12]
Device 00:01.0 (slot 0): PCI bridge
INTA: link 0x01, irq mask 0x1eb8 [3,4,5,7,9,10,11,12]
INTB: link 0x02, irq mask 0x1eb8 [3,4,5,7,9,10,11,12]
INTC: link 0x03, irq mask 0x1eb8 [3,4,5,7,9,10,11,12]
INTD: link 0x05, irq mask 0x1eb8 [3,4,5,7,9,10,11,12]
Interrupt router at 00:04.0: VIA 82C686 PCI-to-ISA bridge
PIRQA (link 0x01): irq 11
PIRQB (link 0x02): irq 10
PIRQC (link 0x03): unrouted
PIRQD (link 0x05): irq 5
regards,
Davej.
--
| Dave Jones <[email protected]> http://www.suse.de/~davej
| SuSE Labs
On Thu, 7 Dec 2000, Linus Torvalds wrote:
> Ok, definitely needs some more work. Thanks for testing - I have no
> hardware where this is needed.
Well, so I've tried to go on since my box has this "feature". Seems I
finally got the thing tracked down to several issues with mutual influence
and thus really hard to reproduce.
Apparently there are not (yet?) so much people hurt by it so my believe is
the required cleanup would be post-2.4.0(final). Just want to give you
some idea of the interesting things I've seen so far:
1) cardbus_resume() gets invoked more than once, even at test12, where the
"bridges hit twice" case from pci_pm stuff is fixed.
The reason is pcmcia_core stuff sleeping in the resume path waiting for
card detection. So we are scheduled with the resume still pending and
cardbus_resume() is entered again from kapm-idled (first was from userland
apmd context). So we get screwed when doing a second init while already
waiting for the card to finish interrogation.
My solution: asynch semantics for cardbus_resume() wrt to pcmcia_core
using a scheduled resume_bh. So we finish the pm callback before
sleeping.
2) While 1) prevents us from fooling ourselves there might be other
drivers sleeping in resume. According to Documentation/pm.txt it is legal
to do so. Probably, instead of speaking of some unspecified advantage to
finish fast, it should be stated as strongly discouraged to sleep.
Otherwise one single driver could trigger the multiple resume case for all
others. Anyway, best solution might be a clean state machine which handles
the pm transitions (and pci hotplugging). IMHO this is 2.5 stuff so I've
tried to protect the yenta stuff by its own (lockable) state flag.
3) The TI1131 is apparently not PCI PM 1.0 compliant. At least it seems it
has been replaced by the 12xx series at the moment some major player
required PCI PM 1.0 to get his "Designed for ..." label in '98 ;-)
So I had to add some code to save and restore things like memory and io
windows of the bridge which were lost after resume. This is implemented as
a controller specific addon to the common yenta operations similar to the
open/init case.
4) The final bang was when I realized that after all that done the
content of the CardBus/ExCA register space was total garbage after
resume. And, even worse, it completely failed to restore - not even
0's written to it could be read back as such. This turned out to be a
io-mapping issue! Believe it or not - my solution is to disable the
cardbus controller in BIOS setup. The rationale is as follows:
- When controller is enabled the BIOS assigns BASE_0 to 0xe6000/0xe7000.
This is mapped to 0xc00e6000 by ioremap(). Everything works fine until
we suspend. Furthermore I've proved by use of virt_to_bus() and vice
versa the mapping is still there after resume. However the content is
not writeable anymore and contains some arbitrary garbage - which always
stays the same, even over cold reboot. But no Oops or so - just if
you were writing to /dev/null and reading some hardwired bytes.
Even unmapping it at suspend and remapping after resume did not help.
- With controller disabled on the other hand the BIOS does not assign
BASE_0. So we do it during pciscan (btw., that's why I needed the VLSI
router stuff first, since the IRQ is unrouted too in this case). This
assigns bus-address like 0x10000000 to the guy which we are mapping
to 0xc3-somewhere - fine. This mapping however does not only survive the
suspend/resume like the first one, its content also remains valid -
i.e. no garbage and writeable - here we go :)
Well, at the end yenta is now working together with pm if 1-4) applied.
So I would stop here with this workaround for me and things to be
addressed later at 2.5. Of course I could prepare 2 or 3 patches in case
it might be helpful at pre-2.4. All changes are to yenta_socket only, so
it would at least not break anything else.
However, I don't see what makes bus address 0xe6000 differ from 0x10000000
- except we are crossing the 1M barrier.
>From the i386/ioremap() code I've seen the 640k-1M range is handled
separately since it's always mapped. Some chance to loose something here
during suspend? Pagetables/-caches are expected to remain valid - right?
Btw, all access to the cardbus/exca registers go to the inlines at the top
of yenta.c using read[bwl]() - which is (for the i386) defined to simply
dereference __io_virt(addr). But we have addr pointing somewhere to the
cardbus registers already memory mapped, so we could simply say *addr.
Just a minor notational inconsistency or is there good reason to access
iomem one way or the other (aliasing, caching,...)?
Regards
Martin
On Fri, 15 Dec 2000, Martin Diehl wrote:
>
> 3) The TI1131 is apparently not PCI PM 1.0 compliant. At least it seems it
> has been replaced by the 12xx series at the moment some major player
> required PCI PM 1.0 to get his "Designed for ..." label in '98 ;-)
> So I had to add some code to save and restore things like memory and io
> windows of the bridge which were lost after resume. This is implemented as
> a controller specific addon to the common yenta operations similar to the
> open/init case.
Fair enough.
> 4) The final bang was when I realized that after all that done the
> content of the CardBus/ExCA register space was total garbage after
> resume. And, even worse, it completely failed to restore - not even
> 0's written to it could be read back as such. This turned out to be a
> io-mapping issue! Believe it or not - my solution is to disable the
> cardbus controller in BIOS setup. The rationale is as follows:
>
> - When controller is enabled the BIOS assigns BASE_0 to 0xe6000/0xe7000.
> This is mapped to 0xc00e6000 by ioremap(). Everything works fine until
> we suspend. Furthermore I've proved by use of virt_to_bus() and vice
> versa the mapping is still there after resume. However the content is
> not writeable anymore and contains some arbitrary garbage - which always
> stays the same, even over cold reboot. But no Oops or so - just if
> you were writing to /dev/null and reading some hardwired bytes.
> Even unmapping it at suspend and remapping after resume did not help.
The ioremap() mappings will definitely still be there - those are kernel
data structures, and the suspend/resume won't do anything to them.
I'm surprised: "yenta_init()" will re-initialize the yenta
PCI_BASE_ADDRESS_0 register, but maybe there's something wrong there. Try
adding a pci_enable_device() to turn the device on and also re-route the
interrupts if necessary.
The above is fairly strange, though. I wonder if the problem is that
0xe6000 value: that's a pretty bogus address for a PCI window, as it's in
the BIOS legacy area.
I suspect that the suspend/resume will do something bad to the BAT
registers, which control the BIOS area mapping behaviour, and it just
kills the forwarding of the legacy region to the PCI bus, or something.
I wonder if the PCI cardbus init code should just notice this, and force
all cardbus windows to be re-initialized. That legacy area address really
doesn't look right.
Linus
On Fri, 15 Dec 2000, Linus Torvalds wrote:
> I'm surprised: "yenta_init()" will re-initialize the yenta
> PCI_BASE_ADDRESS_0 register, but maybe there's something wrong there. Try
right - but it is just writing back the bogus 0xe6000 thing.
> adding a pci_enable_device() to turn the device on and also re-route the
> interrupts if necessary.
Tried: nothing changed. For the TI1131 only the bridge windows are lost,
not resource 0. It's still there and appears valid to the kernels best
knowledge - no need for re-negotiation.
> The above is fairly strange, though. I wonder if the problem is that
> 0xe6000 value: that's a pretty bogus address for a PCI window, as it's in
> the BIOS legacy area.
I've just hacked down the pci resource allocation (namely pci-i386.c) in
such a way to always assign insane 0xe6000/0xe7000 to the base resource 0
(register memory) similar to what the BIOS does. Same result: working
until suspend, identical RO garbage thereafter. Seems it's really a bad
choice to map PCI memory to this area - at least for this box.
> I suspect that the suspend/resume will do something bad to the BAT
> registers, which control the BIOS area mapping behaviour, and it just
> kills the forwarding of the legacy region to the PCI bus, or something.
sounds reasonable wrt what I've seen - Don't trust the BIOS.
> I wonder if the PCI cardbus init code should just notice this, and force
> all cardbus windows to be re-initialized. That legacy area address really
> doesn't look right.
Should work - identify all (bad mapped) regions, free them, and let
pci_enable_device() make things fine. However, I'd suggest doing this at
initial pci device scan since
- not only cardbus devices might be misconfigured by the BIOS
- no need for sanity check in every pci-capable driver.
- similar stuff needed for transparent hotplugging
Loosing part of this later at suspend is a different issue which may
deserve fixing at per-driver basis. But broken PCI memory mapping to BIOS
legacy area should be corrected from the very beginning, I believe.
Regards
Martin
-----
FYI - /proc/iomem showing broken iomapping from BIOS
(the 53c810 might better be page-adjusted too):
00000000-0009fbff : System RAM
0009fc00-0009ffff : reserved
000a0000-000bffff : Video RAM area
000c0000-000c7fff : Video ROM
000e6000-000e6fff : Texas Instruments PCI1131
000e7000-000e7fff : Texas Instruments PCI1131 (#2)
000f0000-000fffff : System ROM
00100000-02ffffff : System RAM
00100000-0020afaf : Kernel code
0020afb0-0021f7a3 : Kernel data
10000000-103fffff : PCI CardBus #80
10400000-107fffff : PCI CardBus #80
10800000-10bfffff : PCI CardBus #81
10c00000-10ffffff : PCI CardBus #81
1fffff00-1fffffff : Symbios Logic Inc. (formerly NCR) 53c810
20000000-2fffffff : PCI Bus #01
30000000-3fffffff : PCI Bus #01
c0000000-c03fffff : Neomagic Corporation NM2093 [MagicGraph 128ZV]
c0000000-c010ffff : vesafb
fff00000-ffffffff : reserved
On Fri, 15 Dec 2000, Linus Torvalds wrote:
> I suspect that the suspend/resume will do something bad to the BAT
> registers, which control the BIOS area mapping behaviour, and it just
> kills the forwarding of the legacy region to the PCI bus, or something.
FYI: I've identified a single byte in the hostbridges config space which
is altered after resume. Blindly restoring it makes the 0xe6000 pci bus
address mapping accessible again. But I think that's not the Right way to
fix it.
> I wonder if the PCI cardbus init code should just notice this, and force
> all cardbus windows to be re-initialized. That legacy area address really
> doesn't look right.
That's what I've done now. So I'm sending the modification I made in case
you would still like them for 2.4.0. I've separated it in 2 (almost
independent) patches meant to be applied in series against 2.4.0-t12 (final)
This is part 1. It provides:
- pm_state, pm_lock for cardbus socket to prevent multiple suspend/resume,
especially the reentrant case due to some driver sleeping in resume.
- cardbus_change_pm_state() to handle pm state transition. Suspend is
handled synchronously but resume schedules an asynchronous completion
handler since pcmcia_resume_socket() may sleep.
- placing the pci_set_power_state() calls at the right place
Mainly touching pci_socket.* it fixes the suspend/resume multiple entrance
issues due to some driver (notably cardbus itself) sleeping in resume.
Regards
Martin
-----
diff -Nur v2.4.0-test12/drivers/pcmcia/pci_socket.c v2.4.0-t12-yenta1/driver/pcmcia/pci_socket.c
--- v2.4.0-test12/drivers/pcmcia/pci_socket.c Wed Nov 29 21:47:10 2000
+++ v2.4.0-t12-yenta1/driver/pcmcia/pci_socket.c Sun Dec 17 19:59:27 2000
@@ -178,6 +178,8 @@
socket->op = ops;
dev->driver_data = socket;
spin_lock_init(&socket->event_lock);
+ socket->pm_state = 0;
+ spin_lock_init(&socket->pm_lock);
return socket->op->open(socket);
}
@@ -212,16 +214,83 @@
dev->driver_data = 0;
}
+/* Delayed handler scheduled to complete the D3->D0 transition in the
+ * upper layers. We may sleep in pcmcia_resume_socket() with pm_lock
+ * hold - so we are save from resume re-entry due to other drivers
+ * sleeping in pci_pm resume handling.
+ */
+
+static void cardbus_resume_bh(void *data)
+{
+ pci_socket_t *socket = (pci_socket_t *)data;
+
+ pcmcia_resume_socket(socket->pcmcia_socket);
+ socket->pm_state = 0;
+ spin_unlock(&socket->pm_lock);
+ MOD_DEC_USE_COUNT;
+}
+
+/* We are forced to implement asynch resume semantics because the
+ * pcmcia_resume path sleeps and we might get screwed by a second
+ * pci_pm_resume_device() hitting us in the middle of the first one.
+ * Which might happen anyway, if other drivers do not cooperate!
+ * So it's good to know we are protected by our socket->pm_lock.
+ */
+
+static void cardbus_change_pm_state(pci_socket_t *socket, int newstate)
+{
+ switch (newstate) {
+ case 3:
+ pcmcia_suspend_socket(socket->pcmcia_socket);
+ break;
+
+ case 0:
+ socket->tq_resume.routine = cardbus_resume_bh;
+ socket->tq_resume.data = socket;
+ MOD_INC_USE_COUNT;
+ schedule_task(&socket->tq_resume);
+ break;
+ default:
+ printk("cardbus: undefined power state\n");
+ break;
+ }
+}
+
+
static void cardbus_suspend (struct pci_dev *dev)
{
pci_socket_t *socket = (pci_socket_t *) dev->driver_data;
- pcmcia_suspend_socket (socket->pcmcia_socket);
+
+ spin_lock(&socket->pm_lock);
+ if (socket->pm_state != 0) {
+ spin_unlock(&socket->pm_lock);
+ printk("cardbus: suspend of already suspended socket blocked\n");
+ return;
+ }
+ cardbus_change_pm_state(socket,3);
+ pci_set_power_state(dev,3);
+ socket->pm_state = 3;
+ spin_unlock(&socket->pm_lock);
}
static void cardbus_resume (struct pci_dev *dev)
{
pci_socket_t *socket = (pci_socket_t *) dev->driver_data;
- pcmcia_resume_socket (socket->pcmcia_socket);
+
+ spin_lock(&socket->pm_lock);
+ if (socket->pm_state != 3) {
+ spin_unlock(&socket->pm_lock);
+ printk("cardbus: resume of non-suspended socket blocked\n");
+ return;
+ }
+ pci_set_power_state(dev,0);
+ cardbus_change_pm_state(socket, 0);
+
+ /* we intentionally leave with socket->pm_state not updated
+ * and socket->pm_lock still acquired!
+ * Will be released by the pending cardbus_resume_bh()
+ * Needed to protect against resume re-entry.
+ */
}
diff -Nur v2.4.0-test12/drivers/pcmcia/pci_socket.h v2.4.0-t12-yenta1/driver/pcmcia/pci_socket.h
--- v2.4.0-test12/drivers/pcmcia/pci_socket.h Wed Nov 29 21:47:10 2000
+++ v2.4.0-t12-yenta1/driver/pcmcia/pci_socket.h Sun Dec 17 20:04:32 2000
@@ -26,6 +26,14 @@
/* A few words of private data for the low-level driver.. */
unsigned int private[8];
+
+ /* used for delayed resume completion handler */
+ struct tq_struct tq_resume;
+
+ /* to protect our pm state management */
+ unsigned int pm_state;
+ spinlock_t pm_lock;
+
} pci_socket_t;
struct pci_socket_ops {
diff -Nur v2.4.0-test12/drivers/pcmcia/yenta.c v2.4.0-t12-yenta1/driver/pcmcia/yenta.c
--- v2.4.0-test12/drivers/pcmcia/yenta.c Wed Nov 29 21:47:10 2000
+++ v2.4.0-t12-yenta1/driver/pcmcia/yenta.c Sun Dec 17 20:00:17 2000
@@ -629,8 +629,6 @@
u16 bridge;
struct pci_dev *dev = socket->dev;
- pci_set_power_state(socket->dev, 0);
-
config_writel(socket, CB_LEGACY_MODE_BASE, 0);
config_writel(socket, PCI_BASE_ADDRESS_0, dev->resource[0].start);
config_writew(socket, PCI_COMMAND,
@@ -687,8 +685,6 @@
* the IO and MEM bridging region data.. That is
* something that pci_set_power_state() should
* probably know about bridges anyway.
- *
- pci_set_power_state(socket->dev, 3);
*/
return 0;
This is part 2 of the yenta+pm updates for 2.4.0-t12 - to be applied after
part 1. Touching yenta.c only it provides:
- yenta_validate_base() to check and try to fix in case the BIOS has
mapped the cardbus base registers to the legacy area <1M.
IMHO, this would be better placed in the early pci initialization,
but I prefered keeping all changes inside yenta_socket at pre-2.4.0.
- writing back the cardbus bridges' memory and io windows at resume in
case they were lost. As it turned out this is the only additional thing
to do for the TI1131, I've put that in the general yenta_config_init()
as there might be other controllers with the same problem and it should
not hurt anyway.
- adding pci_set_master() to yenta_open().
This should fix problems caused by bad BIOS configuration or
configuration loss at suspend.
Testing: Both parts together on HP Omnibook 800: Regardless what the BIOS
has done (Cardbus enable/disable, PnP OS yes/no), bad things are detected
and fixed and yenta is now working with pm, even when suspending
immediately after boot and modprobing the yenta stuff later - i.e. we do
not depend on saving something to restore before we suspend for the first
time. Tested with 16bit modem and ne2k card.
Thank you for pointing me in the right direction.
Regards
Martin
-----
--- v2.4.0-t12-yenta1/driver/pcmcia/yenta.c Sun Dec 17 20:00:17 2000
+++ v2.4.0-t12-yenta2/driver/pcmcia/yenta.c Mon Dec 18 11:50:42 2000
@@ -623,14 +623,24 @@
/*
* Initialize the standard cardbus registers
+ * and write back bridge windows in case controller forgot it.
*/
static void yenta_config_init(pci_socket_t *socket)
{
u16 bridge;
struct pci_dev *dev = socket->dev;
+ struct resource *res;
+ u32 offset;
+ unsigned i;
config_writel(socket, CB_LEGACY_MODE_BASE, 0);
config_writel(socket, PCI_BASE_ADDRESS_0, dev->resource[0].start);
+ for (i = 0; i < 4; i++) {
+ res = socket->dev->resource + PCI_BRIDGE_RESOURCES + i;
+ offset = PCI_CB_MEMORY_BASE_0 + 8 * i;
+ config_writel(socket, offset, res->start);
+ config_writel(socket, offset+4, res->end);
+ }
config_writew(socket, PCI_COMMAND,
PCI_COMMAND_IO |
PCI_COMMAND_MEMORY |
@@ -676,17 +686,6 @@
static int yenta_suspend(pci_socket_t *socket)
{
yenta_set_socket(socket, &dead_socket);
-
- /*
- * This does not work currently. The controller
- * loses too much informationduring D3 to come up
- * cleanly. We should probably fix yenta_init()
- * to update all the critical registers, notably
- * the IO and MEM bridging region data.. That is
- * something that pci_set_power_state() should
- * probably know about bridges anyway.
- */
-
return 0;
}
@@ -796,11 +795,66 @@
#define NR_OVERRIDES (sizeof(cardbus_override)/sizeof(struct cardbus_override_struct))
+/* BIOS might have mapped devices' base resource to a bogus memory area
+ * and - even worse - the hostbridge looses this window during suspend.
+ * So we try to detect and fix this by re-assigning the resource if we
+ * find the base resource mapped to legacy area <1M. But we don't try
+ * this, if the obsolete MEM_TYPE_1M flag is set, just in case...
+ */
+
+static void yenta_validate_base(struct pci_dev *dev)
+{
+ struct resource *res;
+ u32 temp, size;
+
+ res = &dev->resource[0];
+ if (!res || !(res->flags&IORESOURCE_MEM) || res->start>=0x00100000)
+ return;
+
+ pci_set_power_state(dev,0);
+ pci_read_config_dword(dev, PCI_BASE_ADDRESS_0, &temp);
+ if (temp & PCI_BASE_ADDRESS_MEM_TYPE_1M) {
+ printk("yenta: found pci memory mapped to <1M legacy area\n");
+ printk("yenta: not touched since (obsolete) 1M type set\n");
+ return;
+ }
+ printk("yenta: fixing bogus pci memory mapping %08lx\n", res->start);
+
+ pci_write_config_dword(dev, PCI_BASE_ADDRESS_0, ~0x0);
+ pci_read_config_dword(dev, PCI_BASE_ADDRESS_0, &temp);
+ if (temp & PCI_BASE_ADDRESS_SPACE_IO) {
+ printk("yenta: pci memory mutated to io - giving up!\n");
+ pci_write_config_dword(dev, PCI_BASE_ADDRESS_0, res->start);
+ return;
+ }
+
+ /* semi-optimal - old values lost if failure (pretty unlikely).
+ * save & restore would require access to the resource list lock,
+ * which is private in kernel/resource.c
+ */
+
+ release_resource(res);
+
+ size = ~(u32)(temp & PCI_BASE_ADDRESS_MEM_MASK);
+ res->name = dev->name;
+ res->start = 0;
+ res->end = res->start + size;
+ res->flags = IORESOURCE_MEM;
+ if (temp & PCI_BASE_ADDRESS_MEM_PREFETCH)
+ res->flags |= IORESOURCE_PREFETCH;
+ res->parent = res->child = res->sibling = NULL;
+ pci_assign_resource(dev,0);
+}
+
/*
* Initialize a cardbus controller. Make sure we have a usable
* interrupt, and that we can map the cardbus area. Fill in the
* socket information structure..
+ * Validating the BIOS-provided base resource mapping before the device
+ * is enabled helps to resolve conflicts with the cardbus bridge windows,
+ * which might already been set but not yet requested.
*/
+
static int yenta_open(pci_socket_t *socket)
{
int i;
@@ -809,12 +863,14 @@
/*
* Do some basic sanity checking..
*/
+ yenta_validate_base(dev);
if (pci_enable_device(dev))
return -1;
if (!pci_resource_start(dev, 0)) {
printk("No cardbus resource!\n");
return -1;
}
+ pci_set_master(dev);
/*
* Ok, start setup.. Map the cardbus registers,