2005-10-06 23:20:36

by linas

Subject: [PATCH 0/22] ppc64: Full sequence of PCI Error recovery patches


[PATCH 0/22] ppc64: Full sequence of PCI Error recovery patches

The following sequence of patches implements the full set of
PCI error recovery functions for ppc64. There are a large
number of patches because I've attempted to keep the scope
of each patch reasonably small, and thus easy to review.
(The system should remain usable and functional after applying
each patch.)

A detailed explanation of what this is and how it works is
in patch 6/22; if you don't already know what this is about,
that would be the place to start reading.

These patches result in systems that have survived multi-hour
runs with thousands of PCI errors injected. Although this is
good, I still can't warrant that this is bug-free, as there
are still hardware combos that haven't been tested. But for
now, it seems to work.

Signed-off-by: Linas Vepstas <[email protected]>


2005-10-06 23:23:23

by linas

Subject: [PATCH 1/22] ppc64: Dynamic LPAR bugfix


01-hotplug-bugfix.patch

In the current 2.6.14-rc2-git6 kernel, performing a Dynamic LPAR Add
of a hotplug slot will crash the system, with the following (abbreviated)
stack trace:

cpu 0x3: Vector: 700 (Program Check) at [c000000053dff7f0]
pc: c0000000004f5974: .__alloc_bootmem+0x0/0xb0
lr: c0000000000258a0: .update_dn_pci_info+0x108/0x118
c0000000000257c8 .update_dn_pci_info+0x30/0x118 (unreliable)
c0000000000258fc .pci_dn_reconfig_notifier+0x4c/0x64
c000000000060754 .notifier_call_chain+0x68/0x9c

The root cause was that __init __alloc_bootmem() was called long after
boot had finished, resulting in a crash because this routine is undefined
after boot time. The patch below fixes this crash, and adds some docs to
clarify the code.
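
As a rough userspace illustration (not part of the patch), the fix amounts to
choosing the allocator from global boot state rather than from a per-PHB flag.
All names ending in _sim are hypothetical stand-ins: mem_init_done_sim plays
the role of mem_init_done, boot_alloc_sim() of alloc_bootmem(), and malloc()
of kmalloc().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's mem_init_done flag. */
static bool mem_init_done_sim = false;

enum alloc_source { FROM_BOOTMEM, FROM_KMALLOC };

/* Hypothetical boot-time arena, standing in for alloc_bootmem(). */
static _Alignas(max_align_t) unsigned char boot_arena[4096];
static size_t boot_arena_used;

static void *boot_alloc_sim(size_t size)
{
	const size_t align = sizeof(max_align_t);
	void *p;

	/* Round up so that every returned pointer stays aligned. */
	size = (size + align - 1) / align * align;
	if (boot_arena_used + size > sizeof(boot_arena))
		return NULL;
	p = &boot_arena[boot_arena_used];
	boot_arena_used += size;
	return p;
}

/* Pick the allocator based on how far boot has progressed. */
static void *alloc_pdn_sim(size_t size, enum alloc_source *src)
{
	if (mem_init_done_sim) {
		*src = FROM_KMALLOC;
		return malloc(size);	/* kmalloc() in the kernel */
	}
	*src = FROM_BOOTMEM;
	return boot_alloc_sim(size);	/* alloc_bootmem() in the kernel */
}
```

The point of the sketch is only that the decision depends on whether the
memory subsystem is up, not on which kind of PHB is being set up.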

p.s. congrats to all for getting slashdotted on this yesterday!

Signed-off-by: Linas Vepstas <[email protected]>

Index: linux-2.6.14-rc2-git6/arch/ppc64/kernel/pci_dn.c
===================================================================
--- linux-2.6.14-rc2-git6.orig/arch/ppc64/kernel/pci_dn.c 2005-10-03 13:45:58.000000000 -0500
+++ linux-2.6.14-rc2-git6/arch/ppc64/kernel/pci_dn.c 2005-10-04 15:37:49.761245845 -0500
@@ -44,7 +44,7 @@
u32 *regs;
struct pci_dn *pdn;

- if (phb->is_dynamic)
+ if (mem_init_done)
pdn = kmalloc(sizeof(*pdn), GFP_KERNEL);
else
pdn = alloc_bootmem(sizeof(*pdn));
@@ -121,6 +121,14 @@
return NULL;
}

+/**
+ * pci_devs_phb_init_dynamic - setup pci devices under this PHB
+ * phb: pci-to-host bridge (top-level bridge connecting to cpu)
+ *
+ * This routine is called both during boot, (before the memory
+ * subsystem is set up, before kmalloc is valid) and during the
+ * dynamic lpar operation of adding a PHB to a running system.
+ */
void __devinit pci_devs_phb_init_dynamic(struct pci_controller *phb)
{
struct device_node * dn = (struct device_node *) phb->arch_data;
@@ -201,9 +209,14 @@
.notifier_call = pci_dn_reconfig_notifier,
};

-/*
- * Actually initialize the phbs.
- * The buswalk on this phb has not happened yet.
+/**
+ * pci_devs_phb_init - Initialize phbs and pci devs under them.
+ *
+ * This routine walks over all phb's (pci-host bridges) on the
+ * system, and sets up assorted pci-related structures
+ * (including pci info in the device node structs) for each
+ * pci device found underneath. This routine runs once,
+ * early in the boot sequence.
*/
void __init pci_devs_phb_init(void)
{

2005-10-06 23:25:09

by linas

Subject: [PATCH 2/22] ppc64: Enable detection bugfix


02-EEH-enable-bugfix.patch

Bugfix: With the current linux-2.6.14-rc2-git6, EEH errors are
ignored because their detection requires an unused, uninitialized
flag to be set. This patch removes the unused flag.
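
A minimal userspace sketch of the effect, assuming EEH_MODE_SUPPORTED is
bit 0 (that value is not shown in this patch) and using a hypothetical
struct pdn_sim in place of struct pci_dn:

```c
#include <assert.h>
#include <stdbool.h>

#define EEH_MODE_SUPPORTED (1<<0)	/* assumed value, not shown in the patch */
#define EEH_MODE_NOCHECK   (1<<1)

/* Hypothetical cut-down stand-in for struct pci_dn. */
struct pdn_sim {
	int eeh_mode;
	int eeh_capable;	/* never set anywhere, so it stays 0 */
};

/* Old test: the zero eeh_capable flag makes every check be ignored. */
static bool check_ignored_old(const struct pdn_sim *p)
{
	return !p->eeh_capable ||
	       !(p->eeh_mode & EEH_MODE_SUPPORTED) ||
	       (p->eeh_mode & EEH_MODE_NOCHECK);
}

/* New test: only the mode bits decide. */
static bool check_ignored_new(const struct pdn_sim *p)
{
	return !(p->eeh_mode & EEH_MODE_SUPPORTED) ||
	       (p->eeh_mode & EEH_MODE_NOCHECK);
}
```

With eeh_mode set to EEH_MODE_SUPPORTED and eeh_capable left at zero, the
old test ignores the check while the new one lets it proceed.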

Signed-off-by: Linas Vepstas <[email protected]>


Index: linux-2.6.14-rc2-git6/arch/ppc64/kernel/eeh.c
===================================================================
--- linux-2.6.14-rc2-git6.orig/arch/ppc64/kernel/eeh.c 2005-10-04 15:32:17.844809875 -0500
+++ linux-2.6.14-rc2-git6/arch/ppc64/kernel/eeh.c 2005-10-04 15:54:21.769066567 -0500
@@ -631,11 +631,12 @@
pdn = PCI_DN(dn);

/* Access to IO BARs might get this far and still not want checking. */
- if (!pdn->eeh_capable || !(pdn->eeh_mode & EEH_MODE_SUPPORTED) ||
+ if (!(pdn->eeh_mode & EEH_MODE_SUPPORTED) ||
pdn->eeh_mode & EEH_MODE_NOCHECK) {
__get_cpu_var(ignored_check)++;
#ifdef DEBUG
- printk ("EEH:ignored check for %s %s\n", pci_name (dev), dn->full_name);
+ printk ("EEH:ignored check (%x) for %s %s\n",
+ pdn->eeh_mode, pci_name (dev), dn->full_name);
#endif
return 0;
}
Index: linux-2.6.14-rc2-git6/include/asm-ppc64/pci-bridge.h
===================================================================
--- linux-2.6.14-rc2-git6.orig/include/asm-ppc64/pci-bridge.h 2005-10-04 15:32:17.845809735 -0500
+++ linux-2.6.14-rc2-git6/include/asm-ppc64/pci-bridge.h 2005-10-04 15:54:21.769066567 -0500
@@ -61,7 +61,6 @@
int devfn; /* for pci devices */
int eeh_mode; /* See eeh.h for possible EEH_MODEs */
int eeh_config_addr;
- int eeh_capable; /* from firmware */
int eeh_check_count; /* # times driver ignored error */
int eeh_freeze_count; /* # times this device froze up. */
int eeh_is_bridge; /* device is pci-to-pci bridge */

2005-10-06 23:26:32

by linas

Subject: [PATCH 3/22] ppc64: EEH Recovery dispatcher thread


03-eeh-event-dispatcher.patch

ppc64: EEH Recovery dispatcher thread

This patch adds a mechanism to create recovery threads when an
EEH event is received. Since an EEH freeze state may be detected
within an interrupt context, we need to get out of the interrupt
context before starting recovery. This dispatcher does this in
two steps: first, it uses a workqueue to get out, and then
launches a kernel thread, so that the recovery routine can
sleep for extended periods without upsetting keventd.

A kernel thread is created with each EEH event, rather than
having one long-running daemon started at boot time. This is
because it is anticipated that EEH events will be very rare
(very very rare, ideally) and so it's pointless to clutter the
process tables with a daemon that will almost never run.
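
The queue-then-drain idea can be sketched in plain userspace C. This is only
an illustration under simplifying assumptions: there is no locking, no
workqueue and no thread, and struct event_sim, send_failure_event_sim() and
drain_events_sim() are hypothetical names. New events go on the head of the
list, mirroring what list_add() does, so the drain visits the newest event
first.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical cut-down stand-in for struct eeh_event. */
struct event_sim {
	struct event_sim *next;
	int state;
	int time_unavail;	/* milliseconds */
};

static struct event_sim *pending;	/* head of the pending list */

/* "Interrupt side": record the event and return right away. */
static int send_failure_event_sim(int state, int time_unavail)
{
	struct event_sim *ev = malloc(sizeof(*ev));

	if (ev == NULL)
		return 1;	/* event dropped, as in the patch */
	ev->state = state;
	ev->time_unavail = time_unavail;
	ev->next = pending;	/* head insertion */
	pending = ev;
	return 0;
}

/* "Thread side": pull events off one at a time until none are left. */
static int drain_events_sim(void (*handle)(const struct event_sim *))
{
	int handled = 0;

	while (pending != NULL) {
		struct event_sim *ev = pending;

		pending = ev->next;
		if (handle != NULL)
			handle(ev);	/* may take as long as it likes */
		free(ev);
		handled++;
	}
	return handled;
}
```

In the real patch the list is protected by a spinlock and the drain runs in a
freshly created kernel thread; the sketch only shows the hand-off.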


Signed-off-by: Linas Vepstas <[email protected]>


Index: linux-2.6.14-rc2-git6/arch/ppc64/kernel/Makefile
===================================================================
--- linux-2.6.14-rc2-git6.orig/arch/ppc64/kernel/Makefile 2005-10-04 15:32:13.000000000 -0500
+++ linux-2.6.14-rc2-git6/arch/ppc64/kernel/Makefile 2005-10-06 17:50:25.365604176 -0500
@@ -37,7 +37,7 @@
bpa_iic.o spider-pic.o

obj-$(CONFIG_KEXEC) += machine_kexec.o
-obj-$(CONFIG_EEH) += eeh.o
+obj-$(CONFIG_EEH) += eeh.o eeh_event.o
obj-$(CONFIG_PROC_FS) += proc_ppc64.o
obj-$(CONFIG_RTAS_FLASH) += rtas_flash.o
obj-$(CONFIG_SMP) += smp.o
Index: linux-2.6.14-rc2-git6/arch/ppc64/kernel/eeh.c
===================================================================
--- linux-2.6.14-rc2-git6.orig/arch/ppc64/kernel/eeh.c 2005-10-04 15:54:21.000000000 -0500
+++ linux-2.6.14-rc2-git6/arch/ppc64/kernel/eeh.c 2005-10-06 17:50:31.847694674 -0500
@@ -19,7 +19,6 @@

#include <linux/init.h>
#include <linux/list.h>
-#include <linux/notifier.h>
#include <linux/pci.h>
#include <linux/proc_fs.h>
#include <linux/rbtree.h>
@@ -33,6 +32,7 @@
#include <asm/atomic.h>
#include <asm/systemcfg.h>
#include "pci.h"
+#include "eeh_event.h"

#undef DEBUG

@@ -70,14 +70,6 @@
* and sent out for processing.
*/

-/* EEH event workqueue setup. */
-static DEFINE_SPINLOCK(eeh_eventlist_lock);
-LIST_HEAD(eeh_eventlist);
-static void eeh_event_handler(void *);
-DECLARE_WORK(eeh_event_wq, eeh_event_handler, NULL);
-
-static struct notifier_block *eeh_notifier_chain;
-
/* If a device driver keeps reading an MMIO register in an interrupt
* handler after a slot isolation event has occurred, we assume it
* is broken and panic. This sets the threshold for how many read
@@ -421,24 +413,6 @@
}

/**
- * eeh_register_notifier - Register to find out about EEH events.
- * @nb: notifier block to callback on events
- */
-int eeh_register_notifier(struct notifier_block *nb)
-{
- return notifier_chain_register(&eeh_notifier_chain, nb);
-}
-
-/**
- * eeh_unregister_notifier - Unregister to an EEH event notifier.
- * @nb: notifier block to callback on events
- */
-int eeh_unregister_notifier(struct notifier_block *nb)
-{
- return notifier_chain_unregister(&eeh_notifier_chain, nb);
-}
-
-/**
* read_slot_reset_state - Read the reset state of a device node's slot
* @dn: device node to read
* @rets: array to return results in
@@ -461,73 +435,6 @@
}

/**
- * eeh_panic - call panic() for an eeh event that cannot be handled.
- * The philosophy of this routine is that it is better to panic and
- * halt the OS than it is to risk possible data corruption by
- * oblivious device drivers that don't know better.
- *
- * @dev pci device that had an eeh event
- * @reset_state current reset state of the device slot
- */
-static void eeh_panic(struct pci_dev *dev, int reset_state)
-{
- /*
- * XXX We should create a separate sysctl for this.
- *
- * Since the panic_on_oops sysctl is used to halt the system
- * in light of potential corruption, we can use it here.
- */
- if (panic_on_oops) {
- struct device_node *dn = pci_device_to_OF_node(dev);
- eeh_slot_error_detail (PCI_DN(dn), 2 /* Permanent Error */);
- panic("EEH: MMIO failure (%d) on device:%s\n", reset_state,
- pci_name(dev));
- }
- else {
- __get_cpu_var(ignored_failures)++;
- printk(KERN_INFO "EEH: Ignored MMIO failure (%d) on device:%s\n",
- reset_state, pci_name(dev));
- }
-}
-
-/**
- * eeh_event_handler - dispatch EEH events. The detection of a frozen
- * slot can occur inside an interrupt, where it can be hard to do
- * anything about it. The goal of this routine is to pull these
- * detection events out of the context of the interrupt handler, and
- * re-dispatch them for processing at a later time in a normal context.
- *
- * @dummy - unused
- */
-static void eeh_event_handler(void *dummy)
-{
- unsigned long flags;
- struct eeh_event *event;
-
- while (1) {
- spin_lock_irqsave(&eeh_eventlist_lock, flags);
- event = NULL;
- if (!list_empty(&eeh_eventlist)) {
- event = list_entry(eeh_eventlist.next, struct eeh_event, list);
- list_del(&event->list);
- }
- spin_unlock_irqrestore(&eeh_eventlist_lock, flags);
- if (event == NULL)
- break;
-
- printk(KERN_INFO "EEH: MMIO failure (%d), notifiying device "
- "%s\n", event->reset_state,
- pci_name(event->dev));
-
- notifier_call_chain (&eeh_notifier_chain,
- EEH_NOTIFY_FREEZE, event);
-
- pci_dev_put(event->dev);
- kfree(event);
- }
-}
-
-/**
* eeh_token_to_phys - convert EEH address token to phys address
* @token i/o token, should be address in the form 0xA....
*/
@@ -613,8 +520,6 @@
int ret;
int rets[3];
unsigned long flags;
- int reset_state;
- struct eeh_event *event;
struct pci_dn *pdn;
struct device_node *pe_dn;
int rc = 0;
@@ -722,33 +627,12 @@
__eeh_mark_slot (pe_dn);
spin_unlock_irqrestore(&confirm_error_lock, flags);

- reset_state = rets[0];
-
- eeh_slot_error_detail (pdn, 1 /* Temporary Error */);
-
- printk(KERN_INFO "EEH: MMIO failure (%d) on device: %s %s\n",
- rets[0], dn->name, dn->full_name);
- event = kmalloc(sizeof(*event), GFP_ATOMIC);
- if (event == NULL) {
- eeh_panic(dev, reset_state);
- return 1;
- }
-
- event->dev = dev;
- event->dn = dn;
- event->reset_state = reset_state;
-
- /* We may or may not be called in an interrupt context */
- spin_lock_irqsave(&eeh_eventlist_lock, flags);
- list_add(&event->list, &eeh_eventlist);
- spin_unlock_irqrestore(&eeh_eventlist_lock, flags);
-
+ eeh_send_failure_event (dn, dev, rets[0], rets[2]);
+
/* Most EEH events are due to device driver bugs. Having
* a stack trace will help the device-driver authors figure
* out what happened. So print that out. */
if (rets[0] != 5) dump_stack();
- schedule_work(&eeh_event_wq);
-
return 1;

dn_unlock:
@@ -793,6 +677,14 @@

EXPORT_SYMBOL(eeh_check_failure);

+/* ------------------------------------------------------------- */
+/* The code below deals with enabling EEH for devices during the
+ * early boot sequence. EEH must be enabled before any PCI probing
+ * can be done.
+ */
+
+#define EEH_ENABLE 1
+
struct eeh_early_enable_info {
unsigned int buid_hi;
unsigned int buid_lo;
@@ -850,8 +742,9 @@
/* First register entry is addr (00BBSS00) */
/* Try to enable eeh */
ret = rtas_call(ibm_set_eeh_option, 4, 1, NULL,
- regs[0], info->buid_hi, info->buid_lo,
- EEH_ENABLE);
+ regs[0], info->buid_hi, info->buid_lo,
+ EEH_ENABLE);
+
if (ret == 0) {
eeh_subsystem_enabled = 1;
pdn->eeh_mode |= EEH_MODE_SUPPORTED;
Index: linux-2.6.14-rc2-git6/arch/ppc64/kernel/eeh_event.h
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.14-rc2-git6/arch/ppc64/kernel/eeh_event.h 2005-10-06 17:50:24.089783186 -0500
@@ -0,0 +1,52 @@
+/*
+ * eeh_event.h
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Copyright (c) 2005 Linas Vepstas <[email protected]>
+ */
+
+#ifndef ASM_PPC64_EEH_EVENT_H
+#define ASM_PPC64_EEH_EVENT_H
+
+/** EEH event -- structure holding pci controller data that describes
+ * a change in the isolation status of a PCI slot. A pointer
+ * to this struct is passed as the data pointer in a notify callback.
+ */
+struct eeh_event {
+ struct list_head list;
+ struct device_node *dn; /* struct device node */
+ struct pci_dev *dev; /* affected device */
+ int state;
+ int time_unavail; /* milliseconds until device might be available */
+};
+
+/**
+ * eeh_send_failure_event - generate a PCI error event
+ * @dev pci device
+ *
+ * This routine builds a PCI error event which will be delivered
+ * to all listeners on the peh_notifier_chain.
+ *
+ * This routine can be called within an interrupt context;
+ * the actual event will be delivered in a normal context
+ * (from a workqueue).
+ */
+int eeh_send_failure_event (struct device_node *dn,
+ struct pci_dev *dev,
+ int reset_state,
+ int time_unavail);
+
+#endif /* ASM_PPC64_EEH_EVENT_H */
Index: linux-2.6.14-rc2-git6/include/asm-ppc64/eeh.h
===================================================================
--- linux-2.6.14-rc2-git6.orig/include/asm-ppc64/eeh.h 2005-10-04 15:32:13.000000000 -0500
+++ linux-2.6.14-rc2-git6/include/asm-ppc64/eeh.h 2005-10-06 17:51:48.669915765 -0500
@@ -1,4 +1,4 @@
-/*
+/*
* eeh.h
* Copyright (C) 2001 Dave Engebretsen & Todd Inglett IBM Corporation.
*
@@ -6,12 +6,12 @@
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
- *
+ *
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
- *
+ *
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
@@ -27,8 +27,6 @@

struct pci_dev;
struct device_node;
-struct device_node;
-struct notifier_block;

#ifdef CONFIG_EEH

@@ -37,6 +35,10 @@
#define EEH_MODE_NOCHECK (1<<1)
#define EEH_MODE_ISOLATED (1<<2)

+/* Max number of EEH freezes allowed before we consider the device
+ * to be permanently disabled. */
+#define EEH_MAX_ALLOWED_FREEZES 5
+
void __init eeh_init(void);
unsigned long eeh_check_failure(const volatile void __iomem *token,
unsigned long val);
@@ -59,36 +61,14 @@
* eeh_remove_device - undo EEH setup for the indicated pci device
* @dev: pci device to be removed
*
- * This routine should be when a device is removed from a running
- * system (e.g. by hotplug or dlpar).
+ * This routine should be called when a device is removed from
+ * a running system (e.g. by hotplug or dlpar). It unregisters
+ * the PCI device from the EEH subsystem. I/O errors affecting
+ * this device will no longer be detected after this call; thus,
+ * i/o errors affecting this slot may leave this device unusable.
*/
void eeh_remove_device(struct pci_dev *);

-#define EEH_DISABLE 0
-#define EEH_ENABLE 1
-#define EEH_RELEASE_LOADSTORE 2
-#define EEH_RELEASE_DMA 3
-
-/**
- * Notifier event flags.
- */
-#define EEH_NOTIFY_FREEZE 1
-
-/** EEH event -- structure holding pci slot data that describes
- * a change in the isolation status of a PCI slot. A pointer
- * to this struct is passed as the data pointer in a notify callback.
- */
-struct eeh_event {
- struct list_head list;
- struct pci_dev *dev;
- struct device_node *dn;
- int reset_state;
-};
-
-/** Register to find out about EEH events. */
-int eeh_register_notifier(struct notifier_block *nb);
-int eeh_unregister_notifier(struct notifier_block *nb);
-
/**
* EEH_POSSIBLE_ERROR() -- test for possible MMIO failure.
*
@@ -129,7 +109,7 @@
#define EEH_IO_ERROR_VALUE(size) (-1UL)
#endif /* CONFIG_EEH */

-/*
+/*
* MMIO read/write operations with EEH support.
*/
static inline u8 eeh_readb(const volatile void __iomem *addr)
Index: linux-2.6.14-rc2-git6/arch/ppc64/kernel/eeh_event.c
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.14-rc2-git6/arch/ppc64/kernel/eeh_event.c 2005-10-06 17:50:24.089783186 -0500
@@ -0,0 +1,155 @@
+/*
+ * eeh_event.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Copyright (c) 2005 Linas Vepstas <[email protected]>
+ */
+
+#include <linux/list.h>
+#include <linux/pci.h>
+#include "eeh_event.h"
+
+/** Overview:
+ * EEH error states may be detected within exception handlers;
+ * however, the recovery processing needs to occur asynchronously
+ * in a normal kernel context and not an interrupt context.
+ * This pair of routines creates an event and queues it onto a
+ * work-queue, where a worker thread can drive recovery.
+ */
+
+/* EEH event workqueue setup. */
+static spinlock_t eeh_eventlist_lock = SPIN_LOCK_UNLOCKED;
+LIST_HEAD(eeh_eventlist);
+static void eeh_thread_launcher(void *);
+DECLARE_WORK(eeh_event_wq, eeh_thread_launcher, NULL);
+
+/**
+ * eeh_panic - call panic() for an eeh event that cannot be handled.
+ * The philosophy of this routine is that it is better to panic and
+ * halt the OS than it is to risk possible data corruption by
+ * oblivious device drivers that don't know better.
+ *
+ * @dev pci device that had an eeh event
+ * @reset_state current reset state of the device slot
+ */
+static void eeh_panic(struct pci_dev *dev, int reset_state)
+{
+ /*
+ * Since the panic_on_oops sysctl is used to halt the system
+ * in light of potential corruption, we can use it here.
+ */
+ if (panic_on_oops) {
+ panic("EEH: MMIO failure (%d) on device:%s\n", reset_state,
+ pci_name(dev));
+ }
+ else {
+ printk(KERN_INFO "EEH: Ignored MMIO failure (%d) on device:%s\n",
+ reset_state, pci_name(dev));
+ }
+}
+
+/**
+ * eeh_event_handler - dispatch EEH events. The detection of a frozen
+ * slot can occur inside an interrupt, where it can be hard to do
+ * anything about it. The goal of this routine is to pull these
+ * detection events out of the context of the interrupt handler, and
+ * re-dispatch them for processing at a later time in a normal context.
+ *
+ * @dummy - unused
+ */
+static int eeh_event_handler(void * dummy)
+{
+ unsigned long flags;
+ struct eeh_event *event;
+
+ daemonize ("eehd");
+
+ while (1) {
+ set_current_state(TASK_INTERRUPTIBLE);
+
+ spin_lock_irqsave(&eeh_eventlist_lock, flags);
+ event = NULL;
+ if (!list_empty(&eeh_eventlist)) {
+ event = list_entry(eeh_eventlist.next, struct eeh_event, list);
+ list_del(&event->list);
+ }
+ spin_unlock_irqrestore(&eeh_eventlist_lock, flags);
+ if (event == NULL)
+ break;
+
+ printk(KERN_INFO "EEH: Detected PCI bus error on device %s\n",
+ pci_name(event->dev));
+
+ eeh_panic (event->dev, event->state);
+
+ kfree(event);
+ }
+
+ return 0;
+}
+
+/**
+ * eeh_thread_launcher
+ *
+ * @dummy - unused
+ */
+static void eeh_thread_launcher(void *dummy)
+{
+ if (kernel_thread(eeh_event_handler, NULL, CLONE_KERNEL) < 0)
+ printk(KERN_ERR "Failed to start EEH daemon\n");
+}
+
+/**
+ * eeh_send_failure_event - generate a PCI error event
+ * @dev pci device
+ *
+ * This routine can be called within an interrupt context;
+ * the actual event will be delivered in a normal context
+ * (from a workqueue).
+ */
+int eeh_send_failure_event (struct device_node *dn,
+ struct pci_dev *dev,
+ int state,
+ int time_unavail)
+{
+ unsigned long flags;
+ struct eeh_event *event;
+
+ event = kmalloc(sizeof(*event), GFP_ATOMIC);
+ if (event == NULL) {
+ printk (KERN_ERR "EEH: out of memory, event not handled\n");
+ return 1;
+ }
+
+ if (dev)
+ pci_dev_get(dev);
+
+ event->dn = dn;
+ event->dev = dev;
+ event->state = state;
+ event->time_unavail = time_unavail;
+
+ /* We may or may not be called in an interrupt context */
+ spin_lock_irqsave(&eeh_eventlist_lock, flags);
+ list_add(&event->list, &eeh_eventlist);
+ spin_unlock_irqrestore(&eeh_eventlist_lock, flags);
+
+ schedule_work(&eeh_event_wq);
+
+ return 0;
+}
+
+/********************** END OF FILE ******************************/

2005-10-06 23:28:34

by linas

Subject: [PATCH 4/22] ppc64: EEH Recovery support routines


04-eeh-recovery-support-routines.patch

EEH Recovery support routines

This patch adds routines required to help drive the recovery of
EEH-frozen slots. The main function is to drive the PCI #RST
signal line high for a quarter of a second, and then allow for
a second and a half of settle time.
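
The reset sequence can be sketched as follows, with the hardware and the
sleeps replaced by function pointers so the timing logic can be exercised in
userspace. struct slot_ops_sim, reset_slot_sim() and the fake_* helpers are
hypothetical; the constants are the ones the patch itself defines (250 ms
hold, 1800 ms settle, at most 10 availability polls).

```c
#include <assert.h>

#define RST_HOLD_MSEC	250
#define SETTLE_MSEC	1800
#define MAX_POLLS	10

/* Hypothetical hooks standing in for the RTAS calls and msleep(). */
struct slot_ops_sim {
	void (*set_reset)(int state);	/* 1 asserts #RST, 0 releases it */
	void (*sleep_ms)(int ms);
	int (*availability)(void);	/* <0 failed, 0 ready, >0 ms to wait */
};

/* Assert reset, hold, release, settle, then poll until the slot is ready. */
static int reset_slot_sim(const struct slot_ops_sim *ops)
{
	int i, rc = -1;

	ops->set_reset(1);
	ops->sleep_ms(RST_HOLD_MSEC);
	ops->set_reset(0);
	ops->sleep_ms(SETTLE_MSEC);

	for (i = 0; i < MAX_POLLS; i++) {
		rc = ops->availability();
		if (rc <= 0)
			break;
		ops->sleep_ms(rc + 100);	/* wait a little longer than asked */
	}
	return rc;
}

/* Fakes used to exercise the sketch. */
static int total_slept_ms, reset_line, busy_polls;

static void fake_set_reset(int state) { reset_line = state; }
static void fake_sleep(int ms) { total_slept_ms += ms; }
static int fake_availability(void) { return busy_polls-- > 0 ? 50 : 0; }
```

Unlike rtas_set_slot_reset(), the sketch returns the last availability code
so that a caller could tell a ready slot from one that never came back.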

Signed-off-by: Linas Vepstas <[email protected]>


Index: linux-2.6.14-rc2-git6/arch/ppc64/kernel/pci.h
===================================================================
--- linux-2.6.14-rc2-git6.orig/arch/ppc64/kernel/pci.h 2005-10-06 17:50:31.847694674 -0500
+++ linux-2.6.14-rc2-git6/arch/ppc64/kernel/pci.h 2005-10-06 17:51:58.844488173 -0500
@@ -51,4 +51,18 @@
extern unsigned long pci_assign_all_buses;
extern int pci_read_irq_line(struct pci_dev *pci_dev);

+/* ---- EEH internal-use-only related routines ---- */
+#ifdef CONFIG_EEH
+/**
+ * rtas_set_slot_reset -- unfreeze a frozen slot
+ *
+ * Clear the EEH-frozen condition on a slot. This routine
+ * does this by asserting the PCI #RST line for 1/4 of
+ * a second; this routine will sleep while the adapter is
+ * being reset.
+ */
+void rtas_set_slot_reset (struct pci_dn *);
+
+#endif
+
#endif /* __PPC_KERNEL_PCI_H__ */
Index: linux-2.6.14-rc2-git6/arch/ppc64/kernel/eeh.c
===================================================================
--- linux-2.6.14-rc2-git6.orig/arch/ppc64/kernel/eeh.c 2005-10-06 17:50:31.847694674 -0500
+++ linux-2.6.14-rc2-git6/arch/ppc64/kernel/eeh.c 2005-10-06 17:52:27.908410223 -0500
@@ -17,6 +17,7 @@
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/

+#include <linux/delay.h>
#include <linux/init.h>
#include <linux/list.h>
#include <linux/pci.h>
@@ -678,6 +679,104 @@
EXPORT_SYMBOL(eeh_check_failure);

/* ------------------------------------------------------------- */
+/* The code below deals with error recovery */
+
+/** Return negative value if a permanent error, else return
+ * a number of milliseconds to wait until the PCI slot is
+ * ready to be used.
+ */
+static int
+eeh_slot_availability(struct pci_dn *pdn)
+{
+ int rc;
+ int rets[3];
+
+ rc = read_slot_reset_state(pdn, rets);
+
+ if (rc) return rc;
+
+ if (rets[1] == 0) return -1; /* EEH is not supported */
+ if (rets[0] == 0) return 0; /* Oll Korrect */
+ if (rets[0] == 5) {
+ if (rets[2] == 0) return -1; /* permanently unavailable */
+ return rets[2]; /* number of millisecs to wait */
+ }
+ return -1;
+}
+
+/** rtas_pci_slot_reset raises/lowers the pci #RST line
+ * state: 1/0 to raise/lower the #RST
+ *
+ * Clear the EEH-frozen condition on a slot. This routine
+ * asserts the PCI #RST line if the 'state' argument is '1',
+ * and drops the #RST line if 'state' is '0'. This routine is
+ * safe to call in an interrupt context.
+ *
+ */
+
+static void
+rtas_pci_slot_reset(struct pci_dn *pdn, int state)
+{
+ int rc;
+
+ BUG_ON (pdn==NULL);
+
+ if (!pdn->phb) {
+ printk (KERN_WARNING "EEH: in slot reset, device node %s has no phb\n",
+ pdn->node->full_name);
+ return;
+ }
+
+ rc = rtas_call(ibm_set_slot_reset,4,1, NULL,
+ pdn->eeh_config_addr,
+ BUID_HI(pdn->phb->buid),
+ BUID_LO(pdn->phb->buid),
+ state);
+ if (rc) {
+ printk (KERN_WARNING "EEH: Unable to reset the failed slot, (%d) #RST=%d dn=%s\n",
+ rc, state, pdn->node->full_name);
+ return;
+ }
+
+ if (state == 0)
+ eeh_clear_slot (pdn->node->parent->child);
+}
+
+/** rtas_set_slot_reset -- assert the pci #RST line for 1/4 second
+ * dn -- device node to be reset.
+ */
+
+void
+rtas_set_slot_reset(struct pci_dn *pdn)
+{
+ int i, rc;
+
+ rtas_pci_slot_reset (pdn, 1);
+
+ /* The PCI bus requires that the reset be held high for at least
+ * 100 milliseconds. We wait a bit longer 'just in case'. */
+
+#define PCI_BUS_RST_HOLD_TIME_MSEC 250
+ msleep (PCI_BUS_RST_HOLD_TIME_MSEC);
+ rtas_pci_slot_reset (pdn, 0);
+
+ /* After a PCI slot has been reset, the PCI Express spec requires
+ * a 1.5 second idle time for the bus to stabilize, before starting
+ * up traffic. */
+#define PCI_BUS_SETTLE_TIME_MSEC 1800
+ msleep (PCI_BUS_SETTLE_TIME_MSEC);
+
+ /* Now double check with the firmware to make sure the device is
+ * ready to be used; if not, wait for recovery. */
+ for (i=0; i<10; i++) {
+ rc = eeh_slot_availability (pdn);
+ if (rc <= 0) break;
+
+ msleep (rc+100);
+ }
+}
+
+/* ------------------------------------------------------------- */
/* The code below deals with enabling EEH for devices during the
* early boot sequence. EEH must be enabled before any PCI probing
* can be done.

2005-10-06 23:30:06

by linas

Subject: [PATCH 5/22] ppc64: Device BAR save and restore


05-eeh-device-bar-save.patch

After a PCI device has been reset, the device BARs and other config
space info must be restored to the same state as they were in when
the firmware first handed us this device. This will allow the
PCI device driver, when restarted, to correctly recognize and set up
the device.

This patch saves the device config space as early as reasonable after
the firmware has handed over the device. The state restore function
is intended for use by the EEH recovery routines.
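
One detail of the restore path is worth a small illustration: single bytes
are picked out of the saved 32-bit words with the BYTE_SWAP() index
calculation. The userspace sketch below assumes the saved words sit in memory
most-significant byte first, as they would on a big-endian host;
store_be32() and saved_byte_sim() are hypothetical helpers, while the offsets
0x0c and 0x0d are the standard config-space locations of the cache line size
and latency timer.

```c
#include <assert.h>
#include <stdint.h>

/* Same index calculation as in the patch. */
#define BYTE_SWAP(OFF) (8*((OFF)/4)+3-(OFF))

#define CACHE_LINE_SIZE_OFF	0x0c
#define LATENCY_TIMER_OFF	0x0d

/* Store a 32-bit value most-significant byte first (big-endian layout). */
static void store_be32(uint8_t *dst, uint32_t val)
{
	dst[0] = (uint8_t)(val >> 24);
	dst[1] = (uint8_t)(val >> 16);
	dst[2] = (uint8_t)(val >> 8);
	dst[3] = (uint8_t)val;
}

/* Fetch the config-space byte at offset 'off' from the saved image. */
static uint8_t saved_byte_sim(const uint8_t *saved, int off)
{
	return saved[BYTE_SWAP(off)];
}
```

For example, if the word at offset 0x0c was read back as 0x00004010, the
cache line size byte is 0x10 and the latency timer byte is 0x40, and the
index calculation finds them at bytes 15 and 14 of the saved image.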

Signed-off-by: Linas Vepstas <[email protected]>


Index: linux-2.6.14-rc2-git6/arch/ppc64/kernel/eeh.c
===================================================================
--- linux-2.6.14-rc2-git6.orig/arch/ppc64/kernel/eeh.c 2005-10-06 17:52:27.908410223 -0500
+++ linux-2.6.14-rc2-git6/arch/ppc64/kernel/eeh.c 2005-10-06 17:52:37.399078590 -0500
@@ -78,6 +78,9 @@
*/
#define EEH_MAX_FAILS 100000

+/* Misc forward declarations */
+static void eeh_save_bars(struct pci_dev * pdev, struct pci_dn *pdn);
+
/* RTAS tokens */
static int ibm_set_eeh_option;
static int ibm_set_slot_reset;
@@ -367,6 +370,7 @@
*/
void __init pci_addr_cache_build(void)
{
+ struct device_node *dn;
struct pci_dev *dev = NULL;

if (!eeh_subsystem_enabled)
@@ -380,6 +384,10 @@
continue;
}
pci_addr_cache_insert_device(dev);
+
+ /* Save the BAR's; firmware doesn't restore these after EEH reset */
+ dn = pci_device_to_OF_node(dev);
+ eeh_save_bars(dev, PCI_DN(dn));
}

#ifdef DEBUG
@@ -776,6 +784,108 @@
}
}

+/* ------------------------------------------------------- */
+/** Save and restore of PCI BARs
+ *
+ * Although firmware will set up BARs during boot, it doesn't
+ * set up device BAR's after a device reset, although it will,
+ * if requested, set up bridge configuration. Thus, we need to
+ * configure the PCI devices ourselves.
+ */
+
+/**
+ * __restore_bars - Restore the Base Address Registers
+ * Loads the PCI configuration space base address registers,
+ * the expansion ROM base address, the latency timer, and etc.
+ * from the saved values in the device node.
+ */
+static inline void __restore_bars (struct pci_dn *pdn)
+{
+ int i;
+
+ if (NULL==pdn->phb) return;
+ for (i=4; i<10; i++) {
+ rtas_write_config(pdn, i*4, 4, pdn->config_space[i]);
+ }
+
+ /* 12 == Expansion ROM Address */
+ rtas_write_config(pdn, 12*4, 4, pdn->config_space[12]);
+
+#define BYTE_SWAP(OFF) (8*((OFF)/4)+3-(OFF))
+#define SAVED_BYTE(OFF) (((u8 *)(pdn->config_space))[BYTE_SWAP(OFF)])
+
+ rtas_write_config (pdn, PCI_CACHE_LINE_SIZE, 1,
+ SAVED_BYTE(PCI_CACHE_LINE_SIZE));
+
+ rtas_write_config (pdn, PCI_LATENCY_TIMER, 1,
+ SAVED_BYTE(PCI_LATENCY_TIMER));
+
+ /* max latency, min grant, interrupt pin and line */
+ rtas_write_config(pdn, 15*4, 4, pdn->config_space[15]);
+}
+
+/**
+ * eeh_restore_bars - restore the PCI config space info
+ *
+ * This routine performs a recursive walk to the children
+ * of this device as well.
+ */
+void eeh_restore_bars(struct pci_dn *pdn)
+{
+ struct device_node *dn;
+ if (!pdn)
+ return;
+
+ if (! pdn->eeh_is_bridge)
+ __restore_bars (pdn);
+
+ dn = pdn->node->child;
+ while (dn) {
+ eeh_restore_bars (PCI_DN(dn));
+ dn = dn->sibling;
+ }
+}
+
+/**
+ * eeh_save_bars - save device bars
+ *
+ * Save the values of the device bars. Unlike the restore
+ * routine, this routine is *not* recursive. This is because
+ * PCI devices are added individually; but, for the restore,
+ * an entire slot is reset at a time.
+ */
+static void eeh_save_bars(struct pci_dev * pdev, struct pci_dn *pdn)
+{
+ int i;
+
+ if (!pdev || !pdn )
+ return;
+
+ for (i = 0; i < 16; i++)
+ pci_read_config_dword(pdev, i * 4, &pdn->config_space[i]);
+
+ if (pdev->hdr_type == PCI_HEADER_TYPE_BRIDGE)
+ pdn->eeh_is_bridge = 1;
+}
+
+void
+rtas_configure_bridge(struct pci_dn *pdn)
+{
+ int token = rtas_token ("ibm,configure-bridge");
+ int rc;
+
+ if (token == RTAS_UNKNOWN_SERVICE)
+ return;
+ rc = rtas_call(token,3,1, NULL,
+ pdn->eeh_config_addr,
+ BUID_HI(pdn->phb->buid),
+ BUID_LO(pdn->phb->buid));
+ if (rc) {
+ printk (KERN_WARNING "EEH: Unable to configure device bridge (%d) for %s\n",
+ rc, pdn->node->full_name);
+ }
+}
+
/* ------------------------------------------------------------- */
/* The code below deals with enabling EEH for devices during the
* early boot sequence. EEH must be enabled before any PCI probing
@@ -978,6 +1088,7 @@
void eeh_add_device_late(struct pci_dev *dev)
{
struct device_node *dn;
+ struct pci_dn *pdn;

if (!dev || !eeh_subsystem_enabled)
return;
@@ -988,9 +1099,11 @@

pci_dev_get (dev);
dn = pci_device_to_OF_node(dev);
- PCI_DN(dn)->pcidev = dev;
+ pdn = PCI_DN(dn);
+ pdn->pcidev = dev;

pci_addr_cache_insert_device (dev);
+ eeh_save_bars(dev, pdn);
}
EXPORT_SYMBOL_GPL(eeh_add_device_late);

Index: linux-2.6.14-rc2-git6/arch/ppc64/kernel/pci.h
===================================================================
--- linux-2.6.14-rc2-git6.orig/arch/ppc64/kernel/pci.h 2005-10-06 17:51:58.844488173 -0500
+++ linux-2.6.14-rc2-git6/arch/ppc64/kernel/pci.h 2005-10-06 17:52:37.399078590 -0500
@@ -63,6 +63,29 @@
*/
void rtas_set_slot_reset (struct pci_dn *);

+/**
+ * eeh_restore_bars - Restore device configuration info.
+ *
+ * A reset of a PCI device will clear out its config space.
+ * This routine will restore the config space for this
+ * device, and its children, to values previously obtained
+ * from the firmware.
+ */
+void eeh_restore_bars(struct pci_dn *);
+
+/**
+ * rtas_configure_bridge -- firmware initialization of pci bridge
+ *
+ * Ask the firmware to configure all PCI bridge devices
+ * located behind the indicated node. Required after a
+ * pci device reset. Does essentially the same thing as
+ * eeh_restore_bars, but for bridges, and lets firmware
+ * do the work.
+ */
+void rtas_configure_bridge(struct pci_dn *);
+
+int rtas_write_config(struct pci_dn *, int where, int size, u32 val);
+
#endif

#endif /* __PPC_KERNEL_PCI_H__ */
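
The save/restore scheme in this patch can be sketched in user space as follows. This is only an illustration under stated assumptions, not the kernel code: struct sim_node, sim_save_bars(), sim_reset_slot() and sim_restore_bars() are hypothetical names, and plain arrays stand in for real PCI config-space accesses. It shows why the save is per-device while the restore walks the children, skipping bridges.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_CFG_DWORDS 16 /* 64 bytes of config header, as in eeh_save_bars() */

/* Hypothetical stand-in for struct pci_dn and its device-tree links. */
struct sim_node {
	uint32_t hw[SIM_CFG_DWORDS];    /* simulated device config registers */
	uint32_t saved[SIM_CFG_DWORDS]; /* copy taken when the device was added */
	int is_bridge;
	struct sim_node *child;
	struct sim_node *sibling;
};

static void sim_copy(uint32_t *dst, const uint32_t *src)
{
	assert(dst && src);
	for (int i = 0; i < SIM_CFG_DWORDS; i++)
		dst[i] = src[i];
}

/* Not recursive: devices are added, and therefore saved, one at a time. */
void sim_save_bars(struct sim_node *n)
{
	if (n)
		sim_copy(n->saved, n->hw);
}

/* A slot reset wipes the config space of the node and everything below it. */
void sim_reset_slot(struct sim_node *n)
{
	if (!n)
		return;
	for (int i = 0; i < SIM_CFG_DWORDS; i++)
		n->hw[i] = 0;
	for (struct sim_node *c = n->child; c; c = c->sibling)
		sim_reset_slot(c);
}

/* Recursive: bridges are skipped (left to firmware), children are visited. */
void sim_restore_bars(struct sim_node *n)
{
	if (!n)
		return;
	if (!n->is_bridge)
		sim_copy(n->hw, n->saved);
	for (struct sim_node *c = n->child; c; c = c->sibling)
		sim_restore_bars(c);
}
```

In the sketch, a bridge keeps its zeroed registers after sim_restore_bars(); in the patch that part is delegated to firmware through rtas_configure_bridge().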

2005-10-06 23:31:11

by linas

Subject: [PATCH 6/22] ppc64: PCI Error Recovery: documentation patch


PCI Error Recovery: documentation patch

Various PCI bus errors can be signaled by newer PCI controllers.
Recovering from those errors requires an infrastructure to notify
affected device drivers of the error, and a way of walking through
a reset sequence. This patch adds documentation describing the
current error recovery proposal.

Signed-off-by: Linas Vepstas <[email protected]>

Documentation/pci-error-recovery.txt | 246 +++++++++++++++++++++++++++++++++++
MAINTAINERS | 7
2 files changed, 253 insertions(+)

Index: linux-2.6.14-rc2-git6/Documentation/pci-error-recovery.txt
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.14-rc2-git6/Documentation/pci-error-recovery.txt 2005-10-06 17:52:47.274692945 -0500
@@ -0,0 +1,246 @@
+
+ PCI Error Recovery
+ ------------------
+ May 31, 2005
+
+ Current document maintainer:
+ Linas Vepstas <[email protected]>
+
+
+Some PCI bus controllers are able to detect certain "hard" PCI errors
+on the bus, such as parity errors on the data and address busses, as
+well as SERR and PERR errors. These chipsets are then able to disable
+I/O to/from the affected device, so that, for example, a bad DMA
+address doesn't end up corrupting system memory. These same chipsets
+are also able to reset the affected PCI device, and return it to
+working condition. This document describes a generic API for
+performing error recovery.
+
+The core idea is that after a PCI error has been detected, there must
+be a way for the kernel to coordinate with all affected device drivers
+so that the pci card can be made operational again, possibly after
+performing a full electrical #RST of the PCI card. The API below
+provides a generic API for device drivers to be notified of PCI
+errors, and to be notified of, and respond to, a reset sequence.
+
+Preliminary sketch of API, cut-n-pasted-n-modified email from
+Ben Herrenschmidt, circa 5 april 2005
+
+The error recovery API support is exposed to the driver in the form of
+a structure of function pointers pointed to by a new field in struct
+pci_driver. The absence of this pointer in pci_driver denotes a
+"non-aware" driver; behaviour on these is platform dependent.
+Platforms like ppc64 can try to simulate pci hotplug remove/add.
+
+The definition of "pci_error_token" is not covered here. It is based on
+Seto's work on the synchronous error detection. We still need to define
+functions for extracting information out of an opaque error token. This is
+separate from this API.
+
+This structure has the form:
+
+struct pci_error_handlers
+{
+ int (*error_detected)(struct pci_dev *dev, pci_error_token error);
+ int (*mmio_enabled)(struct pci_dev *dev);
+ int (*resume)(struct pci_dev *dev);
+ int (*link_reset)(struct pci_dev *dev);
+ int (*slot_reset)(struct pci_dev *dev);
+};
+
+A driver doesn't have to implement all of these callbacks. The
+only mandatory one is error_detected(). If a callback is not
+implemented, the corresponding feature is considered unsupported.
+For example, if mmio_enabled() and resume() aren't there, then the
+driver is assumed not to do any direct recovery and to require
+a reset. If link_reset() is not implemented, the card is assumed
+not to care about link resets, in which case, if recovery is supported,
+the core can try recovery (but not slot_reset() unless it really did
+reset the slot). If slot_reset() is not supported, link_reset() can
+be called instead on a slot reset.
+
+The first call will always be:
+
+ 1) error_detected()
+
+ Error detected. This is sent once after an error has been detected. At
+this point, the device might not be accessible anymore depending on the
+platform (the slot will be isolated on ppc64). The driver may already
+have "noticed" the error because of a failing IO, but this is the proper
+"synchronisation point", that is, it gives a chance to the driver to
+clean up, waiting for pending stuff (timers, whatever, etc...) to
+complete; it can take semaphores, schedule, etc... everything but touch
+the device. Within this function and after it returns, the driver
+shouldn't do any new IOs. Called in task context. This is sort of a
+"quiesce" point. See note about interrupts at the end of this doc.
+
+ Result codes:
+ - PCIERR_RESULT_CAN_RECOVER:
+ Driver returns this if it thinks it might be able to recover
+ the HW by just banging IOs or if it wants to be given
+ a chance to extract some diagnostic information (see
+ below).
+ - PCIERR_RESULT_NEED_RESET:
+ Driver returns this if it thinks it can't recover unless the
+ slot is reset.
+ - PCIERR_RESULT_DISCONNECT:
+ Return this if driver thinks it won't recover at all,
+ (this will detach the driver ? or just leave it
+ dangling ? to be decided)
+
+So at this point, we have called error_detected() for all drivers
+on the segment that had the error. On ppc64, the slot is isolated. What
+happens now typically depends on the result from the drivers. If all
+drivers on the segment/slot return PCIERR_RESULT_CAN_RECOVER, we would
+re-enable IOs on the slot (or do nothing special if the platform doesn't
+isolate slots) and call 2). If not and we can reset slots, we go to 4),
+if neither, we have a dead slot. If it's a hotplug slot, we might
+"simulate" reset by triggering HW unplug/replug though.
+
+>>> Current ppc64 implementation assumes that a device driver will
+>>> *not* schedule or semaphore in this routine; the current ppc64
+>>> implementation uses one kernel thread to notify all devices;
+>>> thus, if one device sleeps/schedules, all devices are affected.
+>>> Doing better requires complex multi-threaded logic in the error
+>>> recovery implementation (e.g. waiting for all notification threads
+>>> to "join" before proceeding with recovery.) This seems excessively
+>>> complex and not worth implementing.
+
+>>> The current ppc64 implementation doesn't much care if the device
+>>> attempts i/o at this point, or not. I/O's will fail, returning
+>>> a value of 0xff on read, and writes will be dropped. If the device
+>>> driver attempts more than 10K I/O's to a frozen adapter, the ppc64
+>>> implementation will assume that the driver has gone into an infinite
+>>> loop, and it will panic the kernel.
+
+ 2) mmio_enabled()
+
+ This is the "early recovery" call. IOs are allowed again, but DMA is
+not (hrm... to be discussed, I prefer not), with some restrictions. This
+is NOT a callback for the driver to start operations again, only to
+peek/poke at the device, extract diagnostic information, if any, and
+eventually do things like trigger a device local reset or some such,
+but not restart operations. This is sent if all drivers on a segment
+agree that they can try to recover and no automatic link reset was
+performed by the HW. If the platform can't just re-enable IOs without
+a slot reset or a link reset, it doesn't call this callback and goes
+directly to 3) or 4). All IOs should be done _synchronously_ from
+within this callback, errors triggered by them will be returned via
+the normal pci_check_whatever() api, no new error_detected() callback
+will be issued due to an error happening here. However, such an error
+might cause IOs to be re-blocked for the whole segment, and thus
+invalidate the recovery that other devices on the same segment might
+have done, forcing the whole segment into one of the next states,
+that is link reset or slot reset.
+
+ Result codes:
+ - PCIERR_RESULT_RECOVERED
+ Driver returns this if it thinks the device is fully
+ functional and thinks it is ready to start
+ normal driver operations again. There is no
+ guarantee that the driver will actually be
+ allowed to proceed, as another driver on the
+ same segment might have failed and thus triggered a
+ slot reset on platforms that support it.
+
+ - PCIERR_RESULT_NEED_RESET
+ Driver returns this if it thinks the device is not
+ recoverable in its current state and it needs a slot
+ reset to proceed.
+
+ - PCIERR_RESULT_DISCONNECT
+ Same as above. Total failure, no recovery even after
+ reset; driver dead. (To be defined more precisely)
+
+>>> The current ppc64 implementation does not implement this callback.
+
+ 3) link_reset()
+
+ This is called after the link has been reset. This is typically
+a PCI Express specific state at this point and is done whenever a
+non-fatal error has been detected that can be "solved" by resetting
+the link. This call informs the driver of the reset and the driver
+should check if the device appears to be in working condition.
+This function acts a bit like 2) mmio_enabled(), in that the driver
+is not supposed to restart normal driver I/O operations right away.
+Instead, it should just "probe" the device to check its recoverability
+status. If all is right, then the core will call resume() once all
+drivers have ack'd link_reset().
+
+ Result codes:
+ (identical to mmio_enabled)
+
+>>> The current ppc64 implementation does not implement this callback.
+
+ 4) slot_reset()
+
+ This is called after the slot has been soft or hard reset by the
+platform. A soft reset consists of asserting the adapter #RST line
+and then restoring the PCI BARs and PCI configuration header. If the
+platform supports PCI hotplug, then it might instead perform a hard
+reset by toggling power on the slot off/on. This call gives drivers
+the chance to re-initialize the hardware (re-download firmware, etc.),
+but drivers shouldn't restart normal I/O processing operations at
+this point. (See note about interrupts; interrupts aren't guaranteed
+to be delivered until the resume() callback has been called). If all
+device drivers report success on this callback, the platform will call
+resume() to complete the error handling and let the driver restart
+normal I/O processing.
+
+A driver can still return a critical failure for this function if
+it can't get the device operational after reset. If the platform
+previously tried a soft reset, it might now try a hard reset (power
+cycle) and then call slot_reset() again. If the device still can't
+be recovered, there is nothing more that can be done; the platform
+will typically report a "permanent failure" in such a case. The
+device will be considered "dead" in this case.
+
+ Result codes:
+ - PCIERR_RESULT_DISCONNECT
+ Same as above.
+
+>>> The current ppc64 implementation does not try a power-cycle reset
+>>> if the driver returned PCIERR_RESULT_DISCONNECT. However, it should.
+
+ 5) resume()
+
+ This is called if all drivers on the segment have returned
+PCIERR_RESULT_RECOVERED from one of the 3 previous callbacks.
+That basically tells the driver to restart activity, that everything
+is back and running. No result code is taken into account here. If
+a new error happens, it will restart a new error handling process.
+
+That's it. I think this covers all the possibilities. The way those
+callbacks are called is platform policy. A platform with no slot reset
+capability for example may want to just "ignore" drivers that can't
+recover (disconnect them) and try to let other cards on the same segment
+recover. Keep in mind that in most real life cases, though, there will
+be only one driver per segment.
+
+Now, there is a note about interrupts. If you get an interrupt and your
+device is dead or has been isolated, there is a problem :)
+
+After much thinking, I decided to leave that to the platform. That is,
+the recovery API only specifies that:
+
+ - There is no guarantee that interrupt delivery can proceed from any
+device on the segment starting from the error detection and until the
+resume() callback is sent, at which point interrupts are expected to be
+fully operational.
+
+ - There is no guarantee that interrupt delivery is stopped, that is, a
+driver that gets an interrupt after detecting an error, or that detects
+an error within the interrupt handler such that it prevents proper
+ack'ing of the interrupt (and thus removal of the source) should just
+return IRQ_NONE. It's up to the platform to deal with that
+condition, typically by masking the irq source during the duration of
+the error handling. It is expected that the platform "knows" which
+interrupts are routed to error-management capable slots and can deal
+with temporarily disabling that irq number during error processing (this
+isn't terribly complex). That means some IRQ latency for other devices
+sharing the interrupt, but there is simply no other way. High end
+platforms aren't supposed to share interrupts between many devices
+anyway :)
+
+
+Revised: 31 May 2005 Linas Vepstas <[email protected]>
Index: linux-2.6.14-rc2-git6/MAINTAINERS
===================================================================
--- linux-2.6.14-rc2-git6.orig/MAINTAINERS 2005-10-06 17:50:30.073943549 -0500
+++ linux-2.6.14-rc2-git6/MAINTAINERS 2005-10-06 17:52:47.296689858 -0500
@@ -1859,6 +1859,13 @@
L: [email protected]
S: Maintained

+PCI ERROR RECOVERY
+P: Linas Vepstas
+M: [email protected]
+L: [email protected]
+L: [email protected]
+S: Supported
+
PCI SOUND DRIVERS (ES1370, ES1371 and SONICVIBES)
P: Thomas Sailer
M: [email protected]
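
The decision logic described in the documentation added by this patch can be sketched roughly as below. This is a hedged, user-space illustration of the text, not the ppc64 implementation: the sim_* names are hypothetical, the enum only mirrors the result codes named in the document, and the handling of PCIERR_RESULT_DISCONNECT (which the document leaves "to be decided") is simply folded into the "not all drivers can recover" case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of the result codes named in the document. */
enum sim_result {
	SIM_RESULT_NONE = 0,
	SIM_RESULT_CAN_RECOVER,
	SIM_RESULT_NEED_RESET,
	SIM_RESULT_DISCONNECT,
	SIM_RESULT_RECOVERED,
};

/* What the platform does after error_detected() has been sent to all drivers. */
enum sim_next_step {
	SIM_STEP_MMIO_ENABLED, /* step 2: re-enable IOs, call mmio_enabled() */
	SIM_STEP_SLOT_RESET,   /* step 4: reset the slot, call slot_reset() */
	SIM_STEP_DEAD_SLOT,    /* neither is possible */
};

/* Returns 1 if every driver on the segment reported the same result. */
static int sim_all_equal(const enum sim_result *r, size_t n, enum sim_result want)
{
	assert(r && n > 0);
	for (size_t i = 0; i < n; i++)
		if (r[i] != want)
			return 0;
	return 1;
}

/* r[] holds one error_detected() result per driver on the segment. */
enum sim_next_step sim_after_error_detected(const enum sim_result *r, size_t n,
					    int can_reset_slot)
{
	if (sim_all_equal(r, n, SIM_RESULT_CAN_RECOVER))
		return SIM_STEP_MMIO_ENABLED;
	if (can_reset_slot)
		return SIM_STEP_SLOT_RESET;
	return SIM_STEP_DEAD_SLOT;
}

/* resume() is sent only if every driver reported RECOVERED. */
int sim_may_resume(const enum sim_result *r, size_t n)
{
	return sim_all_equal(r, n, SIM_RESULT_RECOVERED);
}
```

Note that a single driver answering NEED_RESET is enough to push the whole segment to a slot reset, which is why the document warns that a RECOVERED answer from mmio_enabled() is no guarantee of being allowed to proceed.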

2005-10-06 23:32:15

by linas

Subject: [PATCH 7/22] PCI Error Recovery: header file patch


PCI Error Recovery: header file patch

Various PCI bus errors can be signaled by newer PCI controllers. Recovering
from those errors requires an infrastructure to notify affected device drivers
of the error, and a way of walking through a reset sequence. This patch adds
a set of callbacks to be used by error recovery routines to notify device
drivers of the various stages of recovery.

Signed-off-by: Linas Vepstas <[email protected]>

--
include/linux/pci.h | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 49 insertions(+)

Index: linux-2.6.14-rc2-git6/include/linux/pci.h
===================================================================
--- linux-2.6.14-rc2-git6.orig/include/linux/pci.h 2005-10-06 17:50:29.442032212 -0500
+++ linux-2.6.14-rc2-git6/include/linux/pci.h 2005-10-06 17:52:50.634221570 -0500
@@ -78,6 +78,16 @@
#define PCI_UNKNOWN ((pci_power_t __force) 5)
#define PCI_POWER_ERROR ((pci_power_t __force) -1)

+/** The pci_channel state describes connectivity between the CPU and
+ * the pci device. If some PCI bus between here and the pci device
+ * has crashed or locked up, this info is reflected here.
+ */
+enum pci_channel_state {
+ pci_channel_io_normal = 0, /* I/O channel is in normal state */
+ pci_channel_io_frozen = 1, /* I/O to channel is blocked */
+ pci_channel_io_perm_failure, /* PCI card is dead */
+};
+
/*
* The pci_dev structure is used to describe PCI devices.
*/
@@ -110,6 +120,7 @@
this is D0-D3, D0 being fully functional,
and D3 being off. */

+ enum pci_channel_state error_state; /* current connectivity state */
struct device dev; /* Generic device interface */

/* device is compatible with these IDs */
@@ -231,6 +242,43 @@
unsigned int use_driver_data:1; /* pci_driver->driver_data is used */
};

+/* ---------------------------------------------------------------- */
+/** PCI error recovery infrastructure. If a PCI device driver provides
+ * a set of callbacks in struct pci_error_handlers, then that device driver
+ * will be notified of PCI bus errors, and will be driven to recovery
+ * when an error occurs.
+ */
+
+enum pcierr_result {
+ PCIERR_RESULT_NONE=0, /* no result/none/not supported in device driver */
+ PCIERR_RESULT_CAN_RECOVER=1, /* Device driver can recover without slot reset */
+ PCIERR_RESULT_NEED_RESET, /* Device driver wants slot to be reset. */
+ PCIERR_RESULT_DISCONNECT, /* Device has completely failed, is unrecoverable */
+ PCIERR_RESULT_RECOVERED, /* Device driver is fully recovered and operational */
+};
+
+/* PCI bus error event callbacks */
+struct pci_error_handlers
+{
+ /* PCI bus error detected on this device */
+ int (*error_detected)(struct pci_dev *dev,
+ enum pci_channel_state error);
+
+ /* MMIO has been re-enabled, but not DMA */
+ int (*mmio_enabled)(struct pci_dev *dev);
+
+ /* PCI Express link has been reset */
+ int (*link_reset)(struct pci_dev *dev);
+
+ /* PCI slot has been reset */
+ int (*slot_reset)(struct pci_dev *dev);
+
+ /* Device driver may resume normal operations */
+ void (*resume)(struct pci_dev *dev);
+};
+
+/* ---------------------------------------------------------------- */
+
struct module;
struct pci_driver {
struct list_head node;
@@ -244,6 +292,7 @@
int (*enable_wake) (struct pci_dev *dev, pci_power_t state, int enable); /* Enable wake event */
void (*shutdown) (struct pci_dev *dev);

+ struct pci_error_handlers *err_handler;
struct device_driver driver;
struct pci_dynids dynids;
};
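
As an illustration of how a driver might fill in the new structure, here is a minimal sketch that builds outside the kernel. The enums and struct pci_error_handlers are copied from the patch above; everything else is an assumption made for the example: struct pci_dev is a two-field stub, the driver name "foo" and its io_stopped flag are hypothetical, and whether error_detected() is ever called with pci_channel_io_perm_failure is a guess.

```c
#include <assert.h>
#include <stddef.h>

/* Copied from the patch so the sketch compiles in user space. */
enum pci_channel_state {
	pci_channel_io_normal = 0,
	pci_channel_io_frozen = 1,
	pci_channel_io_perm_failure,
};

enum pcierr_result {
	PCIERR_RESULT_NONE = 0,
	PCIERR_RESULT_CAN_RECOVER = 1,
	PCIERR_RESULT_NEED_RESET,
	PCIERR_RESULT_DISCONNECT,
	PCIERR_RESULT_RECOVERED,
};

/* Stub: the real struct pci_dev has many more fields. */
struct pci_dev {
	enum pci_channel_state error_state;
	int io_stopped; /* hypothetical driver-private flag */
};

struct pci_error_handlers {
	int (*error_detected)(struct pci_dev *dev, enum pci_channel_state error);
	int (*mmio_enabled)(struct pci_dev *dev);
	int (*link_reset)(struct pci_dev *dev);
	int (*slot_reset)(struct pci_dev *dev);
	void (*resume)(struct pci_dev *dev);
};

/* Quiesce: stop issuing new I/O, then ask for a slot reset. */
static int foo_error_detected(struct pci_dev *dev, enum pci_channel_state error)
{
	assert(dev != NULL);
	dev->io_stopped = 1;
	if (error == pci_channel_io_perm_failure)
		return PCIERR_RESULT_DISCONNECT;
	return PCIERR_RESULT_NEED_RESET;
}

/* Re-initialize the hardware here; normal I/O must wait for resume(). */
static int foo_slot_reset(struct pci_dev *dev)
{
	assert(dev != NULL);
	return PCIERR_RESULT_RECOVERED;
}

/* Normal operation may start again. */
static void foo_resume(struct pci_dev *dev)
{
	assert(dev != NULL);
	dev->io_stopped = 0;
}

/* mmio_enabled and link_reset stay NULL, i.e. unsupported. */
struct pci_error_handlers foo_err_handler = {
	.error_detected = foo_error_detected,
	.slot_reset = foo_slot_reset,
	.resume = foo_resume,
};
```

In a real driver the structure would be hooked up through the new err_handler field of struct pci_driver added by this patch.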

2005-10-06 23:33:25

by linas

Subject: [PATCH 8/22] ppc64: Slot Marking Bugfix


08-eeh-slot-marking-bug.patch

A device that experiences a PCI outage may be just one device out
of many that were affected. In order to avoid repeated reports of
a failure, the entire tree of affected devices should be marked
as failed. This patch marks up the entire tree.

Signed-off-by: Linas Vepstas <[email protected]>


Index: linux-2.6.14-rc2-git6/arch/ppc64/kernel/eeh.c
===================================================================
--- linux-2.6.14-rc2-git6.orig/arch/ppc64/kernel/eeh.c 2005-10-06 17:52:37.399078590 -0500
+++ linux-2.6.14-rc2-git6/arch/ppc64/kernel/eeh.c 2005-10-06 17:53:02.164603746 -0500
@@ -480,32 +480,47 @@
* an interrupt context, which is bad.
*/

-static inline void __eeh_mark_slot (struct device_node *dn)
+static inline void __eeh_mark_slot (struct device_node *dn, int mode_flag)
{
while (dn) {
- PCI_DN(dn)->eeh_mode |= EEH_MODE_ISOLATED;
+ if (PCI_DN(dn)) {
+ PCI_DN(dn)->eeh_mode |= mode_flag;

- if (dn->child)
- __eeh_mark_slot (dn->child);
+ if (dn->child)
+ __eeh_mark_slot (dn->child, mode_flag);
+ }
dn = dn->sibling;
}
}

-static inline void __eeh_clear_slot (struct device_node *dn)
+void eeh_mark_slot (struct device_node *dn, int mode_flag)
+{
+ dn = find_device_pe (dn);
+ PCI_DN(dn)->eeh_mode |= mode_flag;
+ __eeh_mark_slot (dn->child, mode_flag);
+}
+
+static inline void __eeh_clear_slot (struct device_node *dn, int mode_flag)
{
while (dn) {
- PCI_DN(dn)->eeh_mode &= ~EEH_MODE_ISOLATED;
- if (dn->child)
- __eeh_clear_slot (dn->child);
+ if (PCI_DN(dn)) {
+ PCI_DN(dn)->eeh_mode &= ~mode_flag;
+ PCI_DN(dn)->eeh_check_count = 0;
+ if (dn->child)
+ __eeh_clear_slot (dn->child, mode_flag);
+ }
dn = dn->sibling;
}
}

-static inline void eeh_clear_slot (struct device_node *dn)
+void eeh_clear_slot (struct device_node *dn, int mode_flag)
{
unsigned long flags;
spin_lock_irqsave(&confirm_error_lock, flags);
- __eeh_clear_slot (dn);
+ dn = find_device_pe (dn);
+ PCI_DN(dn)->eeh_mode &= ~mode_flag;
+ PCI_DN(dn)->eeh_check_count = 0;
+ __eeh_clear_slot (dn->child, mode_flag);
spin_unlock_irqrestore(&confirm_error_lock, flags);
}

@@ -530,7 +545,6 @@
int rets[3];
unsigned long flags;
struct pci_dn *pdn;
- struct device_node *pe_dn;
int rc = 0;

__get_cpu_var(total_mmio_ffs)++;
@@ -632,8 +646,7 @@
/* Avoid repeated reports of this failure, including problems
* with other functions on this device, and functions under
* bridges. */
- pe_dn = find_device_pe (dn);
- __eeh_mark_slot (pe_dn);
+ eeh_mark_slot (dn, EEH_MODE_ISOLATED);
spin_unlock_irqrestore(&confirm_error_lock, flags);

eeh_send_failure_event (dn, dev, rets[0], rets[2]);
@@ -745,9 +758,6 @@
rc, state, pdn->node->full_name);
return;
}
-
- if (state == 0)
- eeh_clear_slot (pdn->node->parent->child);
}

/** rtas_set_slot_reset -- assert the pci #RST line for 1/4 second
@@ -766,6 +776,12 @@

#define PCI_BUS_RST_HOLD_TIME_MSEC 250
msleep (PCI_BUS_RST_HOLD_TIME_MSEC);
+
+ /* We might get hit with another EEH freeze as soon as the
+ * pci slot reset line is dropped. Make sure we don't miss
+ * these, and clear the flag now. */
+ eeh_clear_slot (pdn->node, EEH_MODE_ISOLATED);
+
rtas_pci_slot_reset (pdn, 0);

/* After a PCI slot has been reset, the PCI Express spec requires
Index: linux-2.6.14-rc2-git6/arch/ppc64/kernel/pci.h
===================================================================
--- linux-2.6.14-rc2-git6.orig/arch/ppc64/kernel/pci.h 2005-10-06 17:52:37.399078590 -0500
+++ linux-2.6.14-rc2-git6/arch/ppc64/kernel/pci.h 2005-10-06 17:53:02.165603605 -0500
@@ -86,6 +86,13 @@

int rtas_write_config(struct pci_dn *, int where, int size, u32 val);

+/**
+ * mark and clear slots: find "partition endpoint" PE and set or
+ * clear the flags for each subnode of the PE.
+ */
+void eeh_mark_slot (struct device_node *dn, int mode_flag);
+void eeh_clear_slot (struct device_node *dn, int mode_flag);
+
#endif

#endif /* __PPC_KERNEL_PCI_H__ */
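
The tree marking done by this patch can be sketched in user space as below. It is an illustration only: struct sim_dn, struct sim_pci_dn and the sim_* functions are hypothetical stand-ins for device_node, pci_dn and the eeh_*_slot routines, the flag value is made up, and the lookup of the PE via find_device_pe() and the locking are left out. It shows the flag being applied to the PE and to every PCI node below it, while nodes without PCI data are skipped.

```c
#include <assert.h>
#include <stddef.h>

#define SIM_MODE_ISOLATED (1 << 2) /* hypothetical flag value */

/* Hypothetical stand-ins for struct pci_dn and struct device_node. */
struct sim_pci_dn {
	int eeh_mode;
	int eeh_check_count;
};

struct sim_dn {
	struct sim_pci_dn *data; /* NULL for non-PCI nodes such as a usb hub */
	struct sim_dn *child;
	struct sim_dn *sibling;
};

/* Walk a sibling list; a node without PCI data is skipped, subtree included. */
static void sim_mark_list(struct sim_dn *dn, int flag)
{
	for (; dn; dn = dn->sibling) {
		if (!dn->data)
			continue;
		dn->data->eeh_mode |= flag;
		sim_mark_list(dn->child, flag);
	}
}

static void sim_clear_list(struct sim_dn *dn, int flag)
{
	for (; dn; dn = dn->sibling) {
		if (!dn->data)
			continue;
		dn->data->eeh_mode &= ~flag;
		dn->data->eeh_check_count = 0;
		sim_clear_list(dn->child, flag);
	}
}

/* pe is assumed to already be the top node of the affected subtree. */
void sim_mark_slot(struct sim_dn *pe, int flag)
{
	assert(pe && pe->data);
	pe->data->eeh_mode |= flag;
	sim_mark_list(pe->child, flag);
}

void sim_clear_slot(struct sim_dn *pe, int flag)
{
	assert(pe && pe->data);
	pe->data->eeh_mode &= ~flag;
	pe->data->eeh_check_count = 0;
	sim_clear_list(pe->child, flag);
}
```

Starting at the PE itself, rather than at the PE's sibling list, is what keeps unrelated slots that happen to share a parent from being marked.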

2005-10-06 23:35:07

by linas

Subject: [PATCH 9/22] ppc64: DLPAR slot add and remove bugfixes


09-crash-on-pci-slot-remove.patch

This patch fixes two bugs related to dlpar slot removal and add.

-- Both crashes are due to the fact that some children
of pci nodes are not pci nodes themselves, and thus do not
have pci_dn structures. For example:
/pci@800000020000002/pci@2,3/usb@1/hub@1
/pci@800000020000002/pci@2,3/usb@1,1/hub@1

Strangely, though, sometimes the following appears,
and I don't quite understand why.
/interrupt-controller@3fe0000a400

A typical stack trace:
Vector: 300 (Data Access) at [c0000000555637d0]
pc: c000000000202a50: .dlpar_add_slot+0x108/0x410
c000000000202e78 .add_slot_store+0x7c/0xac
c000000000202da0 .dlpar_attr_store+0x48/0x64
c0000000000f8ee4 .sysfs_write_file+0x100/0x1a0

A similar stack trace is involved for the slot remove.

This code has so far survived testing in which different slots were
added and removed 23 times each.

Signed-off-by: Linas Vepstas <[email protected]>

Index: linux-2.6.14-rc2-git6/arch/ppc64/kernel/pSeries_iommu.c
===================================================================
--- linux-2.6.14-rc2-git6.orig/arch/ppc64/kernel/pSeries_iommu.c 2005-10-06 17:50:28.197206873 -0500
+++ linux-2.6.14-rc2-git6/arch/ppc64/kernel/pSeries_iommu.c 2005-10-06 17:53:46.650361968 -0500
@@ -478,10 +478,13 @@
{
int err = NOTIFY_OK;
struct device_node *np = node;
- struct pci_dn *pci = np->data;
+ struct pci_dn *pci;

switch (action) {
case PSERIES_RECONFIG_REMOVE:
+ pci = PCI_DN(np);
+ if (!pci)
+ return NOTIFY_OK;
if (pci->iommu_table &&
get_property(np, "ibm,dma-window", NULL))
iommu_free_table(np);
Index: linux-2.6.14-rc2-git6/arch/ppc64/kernel/pci_dn.c
===================================================================
--- linux-2.6.14-rc2-git6.orig/arch/ppc64/kernel/pci_dn.c 2005-10-06 17:50:28.198206733 -0500
+++ linux-2.6.14-rc2-git6/arch/ppc64/kernel/pci_dn.c 2005-10-06 17:53:46.660360565 -0500
@@ -195,7 +195,10 @@

switch (action) {
case PSERIES_RECONFIG_ADD:
- pci = np->parent->data;
+ pci = PCI_DN(np->parent);
+ if (!pci)
+ return NOTIFY_OK;
+
update_dn_pci_info(np, pci->phb);
break;
default:

2005-10-06 23:36:47

by linas

Subject: [PATCH 10/22] ppc64: Crash on DLPAR PHB add


10-rpaphp-crashing.patch

This patch fixes a bug related to dlpar PHB add, after a PHB removal.

-- The crash was due to the PHB not having a pci_dn structure yet,
when the phb is being added.

This code has so far survived testing in which the PHB and all slots
underneath it were added and removed 17 times.

Signed-off-by: Linas Vepstas <[email protected]>

Index: linux-2.6.14-rc2-git6/drivers/pci/hotplug/rpadlpar_core.c
===================================================================
--- linux-2.6.14-rc2-git6.orig/drivers/pci/hotplug/rpadlpar_core.c 2005-10-06 17:50:27.631286278 -0500
+++ linux-2.6.14-rc2-git6/drivers/pci/hotplug/rpadlpar_core.c 2005-10-06 17:53:50.226860151 -0500
@@ -303,7 +303,7 @@
{
struct pci_controller *phb;

- if (PCI_DN(dn)->phb) {
+ if (PCI_DN(dn) && PCI_DN(dn)->phb) {
/* PHB already exists */
return -EINVAL;
}

2005-10-06 23:39:25

by linas

Subject: [PATCH 11/22] ppc64: RPA PHP and EEH common code


11-rpaphp-eeh-cleanup.patch

This patch moves some code from the rpaphp directory to the ppc64 directory,
where it should have been all along. (Among other things, I need it in the
ppc64 directory for the PCI error recovery.)

Please note that this patch affects TWO maintainers: Paul, after applying
the ppc64 part, please ask that GregKH apply the PCI part. It is safe
to have the ppc64 part go in first. It would be bad to have the
PCI part go in first.

Signed-off-by: Linas Vepstas <[email protected]>

Index: linux-2.6.14-rc2-git6/arch/ppc64/kernel/eeh.c
===================================================================
--- linux-2.6.14-rc2-git6.orig/arch/ppc64/kernel/eeh.c 2005-10-06 17:53:02.164603746 -0500
+++ linux-2.6.14-rc2-git6/arch/ppc64/kernel/eeh.c 2005-10-06 17:53:52.475544639 -0500
@@ -1094,6 +1094,15 @@
}
EXPORT_SYMBOL_GPL(eeh_add_device_early);

+void eeh_add_device_tree_early(struct device_node *dn)
+{
+ struct device_node *sib;
+ for (sib = dn->child; sib; sib = sib->sibling)
+ eeh_add_device_tree_early(sib);
+ eeh_add_device_early(dn);
+}
+EXPORT_SYMBOL_GPL(eeh_add_device_tree_early);
+
/**
* eeh_add_device_late - perform EEH initialization for the indicated pci device
* @dev: pci device for which to set up EEH
@@ -1148,6 +1157,23 @@
}
EXPORT_SYMBOL_GPL(eeh_remove_device);

+void eeh_remove_bus_device(struct pci_dev *dev)
+{
+ eeh_remove_device(dev);
+ if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) {
+ struct pci_bus *bus = dev->subordinate;
+ struct list_head *ln;
+ if (!bus)
+ return;
+ for (ln = bus->devices.next; ln != &bus->devices; ln = ln->next) {
+ struct pci_dev *pdev = pci_dev_b(ln);
+ if (pdev)
+ eeh_remove_bus_device(pdev);
+ }
+ }
+}
+EXPORT_SYMBOL_GPL(eeh_remove_bus_device);
+
static int proc_eeh_show(struct seq_file *m, void *v)
{
unsigned int cpu;
Index: linux-2.6.14-rc2-git6/include/asm-ppc64/eeh.h
===================================================================
--- linux-2.6.14-rc2-git6.orig/include/asm-ppc64/eeh.h 2005-10-06 17:51:48.669915765 -0500
+++ linux-2.6.14-rc2-git6/include/asm-ppc64/eeh.h 2005-10-06 17:53:52.476544499 -0500
@@ -55,6 +55,7 @@
* to finish the eeh setup for this device.
*/
void eeh_add_device_early(struct device_node *);
+void eeh_add_device_tree_early(struct device_node *);
void eeh_add_device_late(struct pci_dev *);

/**
@@ -70,6 +71,15 @@
void eeh_remove_device(struct pci_dev *);

/**
+ * eeh_remove_bus_device - undo EEH for device & children.
+ * @dev: pci device to be removed
+ *
+ * As above, this removes the device; it also removes its child
+ * pci devices.
+ */
+void eeh_remove_bus_device(struct pci_dev *);
+
+/**
* EEH_POSSIBLE_ERROR() -- test for possible MMIO failure.
*
* If this macro yields TRUE, the caller relays to eeh_check_failure()
Index: linux-2.6.14-rc2-git6/drivers/pci/hotplug/rpaphp_pci.c
===================================================================
--- linux-2.6.14-rc2-git6.orig/drivers/pci/hotplug/rpaphp_pci.c 2005-10-06 17:50:27.039369330 -0500
+++ linux-2.6.14-rc2-git6/drivers/pci/hotplug/rpaphp_pci.c 2005-10-06 17:53:52.477544359 -0500
@@ -251,17 +251,6 @@
return dev;
}

-static void enable_eeh(struct device_node *dn)
-{
- struct device_node *sib;
-
- for (sib = dn->child; sib; sib = sib->sibling)
- enable_eeh(sib);
- eeh_add_device_early(dn);
- return;
-
-}
-
static void print_slot_pci_funcs(struct pci_bus *bus)
{
struct device_node *dn;
@@ -287,7 +276,7 @@
if (!dn)
goto exit;

- enable_eeh(dn);
+ eeh_add_device_tree_early(dn);
dev = rpaphp_pci_config_slot(bus);
if (!dev) {
err("%s: can't find any devices.\n", __FUNCTION__);
@@ -301,30 +290,12 @@
}
EXPORT_SYMBOL_GPL(rpaphp_config_pci_adapter);

-static void rpaphp_eeh_remove_bus_device(struct pci_dev *dev)
-{
- eeh_remove_device(dev);
- if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) {
- struct pci_bus *bus = dev->subordinate;
- struct list_head *ln;
- if (!bus)
- return;
- for (ln = bus->devices.next; ln != &bus->devices; ln = ln->next) {
- struct pci_dev *pdev = pci_dev_b(ln);
- if (pdev)
- rpaphp_eeh_remove_bus_device(pdev);
- }
-
- }
- return;
-}
-
int rpaphp_unconfig_pci_adapter(struct pci_bus *bus)
{
struct pci_dev *dev, *tmp;

list_for_each_entry_safe(dev, tmp, &bus->devices, bus_list) {
- rpaphp_eeh_remove_bus_device(dev);
+ eeh_remove_bus_device(dev);
pci_remove_bus_device(dev);
}
return 0;
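
The traversal order used by eeh_add_device_tree_early() (children first, then the node itself) can be sketched as follows. This is a user-space illustration with hypothetical names (struct sim_dn, sim_walk_children_first); recording the visit order in an array stands in for the call to eeh_add_device_early().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device_node. */
struct sim_dn {
	const char *name;
	struct sim_dn *child;
	struct sim_dn *sibling;
};

/*
 * Visit every child subtree first, then the node itself, appending each
 * visited name to out[]. Returns the new number of entries in out[].
 */
size_t sim_walk_children_first(const struct sim_dn *dn, const char **out,
			       size_t pos, size_t cap)
{
	assert(dn && out);
	for (const struct sim_dn *c = dn->child; c; c = c->sibling)
		pos = sim_walk_children_first(c, out, pos, cap);
	if (pos < cap)
		out[pos++] = dn->name;
	return pos;
}
```

eeh_remove_bus_device() in the same patch goes the other way around: it handles the device itself first and only then descends to the devices on the subordinate bus.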

2005-10-06 23:40:47

by linas

Subject: [PATCH 12/22] ppc64: RPA PHP cleanup


12-rpaphp-cleanup.patch

This patch cleans up some rpa dlpar code. Basically,
the rpaphp_config_pci_adapter() was a wrapper routine, which
made two calls, and wrapped a bunch of verbose no-op code
around it. This was consolidated with the routine it called.

Signed-off-by: Linas Vepstas <[email protected]>

Index: linux-2.6.14-rc2-git6/drivers/pci/hotplug/rpaphp_pci.c
===================================================================
--- linux-2.6.14-rc2-git6.orig/drivers/pci/hotplug/rpaphp_pci.c 2005-10-06 17:53:52.477544359 -0500
+++ linux-2.6.14-rc2-git6/drivers/pci/hotplug/rpaphp_pci.c 2005-10-06 17:53:55.542114371 -0500
@@ -219,18 +219,21 @@
rpaphp_pci_config_slot() will configure all devices under the
given slot->dn and return the the first pci_dev.
*****************************************************************************/
-static struct pci_dev *
-rpaphp_pci_config_slot(struct pci_bus *bus)
+int
+rpaphp_config_pci_adapter(struct pci_bus *bus)
{
struct device_node *dn = pci_bus_to_OF_node(bus);
struct pci_dev *dev = NULL;
+ int rc = -ENODEV;
int slotno;
int num;

dbg("Enter %s: dn=%s bus=%s\n", __FUNCTION__, dn->full_name, bus->name);
if (!dn || !dn->child)
- return NULL;
+ goto exit;

+ eeh_add_device_tree_early(dn);
+
slotno = PCI_SLOT(PCI_DN(dn->child)->devfn);

/* pci_scan_slot should find all children */
@@ -241,15 +244,23 @@
}
if (list_empty(&bus->devices)) {
err("%s: No new device found\n", __FUNCTION__);
- return NULL;
+ goto exit;
}
list_for_each_entry(dev, &bus->devices, bus_list) {
if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE)
rpaphp_pci_config_bridge(dev);
}

- return dev;
+ dbg("%s: pci_devs of slot[%s]\n", __FUNCTION__, dn->full_name);
+ list_for_each_entry (dev, &bus->devices, bus_list)
+ dbg("\t%s\n", pci_name(dev));
+
+ rc = 0;
+exit:
+ dbg("Exit %s: rc=%d\n", __FUNCTION__, rc);
+ return rc;
}
+EXPORT_SYMBOL_GPL(rpaphp_config_pci_adapter);

static void print_slot_pci_funcs(struct pci_bus *bus)
{
@@ -266,30 +277,6 @@
return;
}

-int rpaphp_config_pci_adapter(struct pci_bus *bus)
-{
- struct device_node *dn = pci_bus_to_OF_node(bus);
- struct pci_dev *dev;
- int rc = -ENODEV;
-
- dbg("Entry %s: slot[%s]\n", __FUNCTION__, dn->full_name);
- if (!dn)
- goto exit;
-
- eeh_add_device_tree_early(dn);
- dev = rpaphp_pci_config_slot(bus);
- if (!dev) {
- err("%s: can't find any devices.\n", __FUNCTION__);
- goto exit;
- }
- print_slot_pci_funcs(bus);
- rc = 0;
-exit:
- dbg("Exit %s: rc=%d\n", __FUNCTION__, rc);
- return rc;
-}
-EXPORT_SYMBOL_GPL(rpaphp_config_pci_adapter);
-
int rpaphp_unconfig_pci_adapter(struct pci_bus *bus)
{
struct pci_dev *dev, *tmp;

2005-10-06 23:44:46

by linas

Subject: [PATCH 13/22] ppc64: RPAPHP duplicated code removal


13-rpaphp-eliminate-dupe-code.patch

The RPAPHP code contains two routines that appear to be gratuitous copies
of very similar PCI code. In particular,

rpaphp_claim_resource ~~ pci_claim_resource
rpadlpar_claim_one_bus == pcibios_claim_one_bus

This patch removes the rpaphp versions of the code.
It survived an overnight run of thousands of add/remove
operations on the slots and PHB.
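
For readers who don't have the PCI code in front of them, the loop
shared by both the removed and the retained routines can be sketched
in user space roughly as follows. This is only an illustration: the
mock_* types, NUM_RESOURCES and claim_unclaimed() are hypothetical
stand-ins invented here, not kernel interfaces. Only the skip
condition (already has a parent, no start address, or no flags) is
taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_RESOURCES 4	/* hypothetical stand-in for PCI_NUM_RESOURCES */

/* Hypothetical, simplified model of a resource window. */
struct mock_resource {
	unsigned long start;
	unsigned long flags;
	struct mock_resource *parent;	/* non-NULL once claimed */
};

/* Hypothetical, simplified model of a PCI device. */
struct mock_dev {
	struct mock_resource resource[NUM_RESOURCES];
};

/* Stand-in for pci_claim_resource(): attach resource i to a root. */
static int mock_claim_resource(struct mock_dev *dev, int i,
			       struct mock_resource *root)
{
	dev->resource[i].parent = root;
	return 0;
}

/*
 * Claim every resource that is populated and not yet claimed.
 * Returns the number of resources claimed by this call.
 */
int claim_unclaimed(struct mock_dev *dev, struct mock_resource *root)
{
	int i, claimed = 0;

	assert(dev != NULL && root != NULL);
	for (i = 0; i < NUM_RESOURCES; i++) {
		struct mock_resource *r = &dev->resource[i];

		/* Same skip test as in the patch. */
		if (r->parent || !r->start || !r->flags)
			continue;
		if (mock_claim_resource(dev, i, root) == 0)
			claimed++;
	}
	return claimed;
}
```

Because already-claimed resources are skipped, calling such a loop a
second time on the same device claims nothing more, which is why
one shared copy is safe to call from both the boot and hotplug paths.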

Signed-off-by: Linas Vepstas <[email protected]>


Index: linux-2.6.14-rc2-git6/drivers/pci/hotplug/rpaphp_pci.c
===================================================================
--- linux-2.6.14-rc2-git6.orig/drivers/pci/hotplug/rpaphp_pci.c 2005-10-06 17:53:55.542114371 -0500
+++ linux-2.6.14-rc2-git6/drivers/pci/hotplug/rpaphp_pci.c 2005-10-06 17:53:57.832792967 -0500
@@ -60,28 +60,6 @@
}
EXPORT_SYMBOL_GPL(rpaphp_find_pci_bus);

-int rpaphp_claim_resource(struct pci_dev *dev, int resource)
-{
- struct resource *res = &dev->resource[resource];
- struct resource *root = pci_find_parent_resource(dev, res);
- char *dtype = resource < PCI_BRIDGE_RESOURCES ? "device" : "bridge";
- int err = -EINVAL;
-
- if (root != NULL) {
- err = request_resource(root, res);
- }
-
- if (err) {
- err("PCI: %s region %d of %s %s [%lx:%lx]\n",
- root ? "Address space collision on" :
- "No parent found for",
- resource, dtype, pci_name(dev), res->start, res->end);
- }
- return err;
-}
-
-EXPORT_SYMBOL_GPL(rpaphp_claim_resource);
-
static int rpaphp_get_sensor_state(struct slot *slot, int *state)
{
int rc;
@@ -176,7 +154,7 @@

if (r->parent || !r->start || !r->flags)
continue;
- rpaphp_claim_resource(dev, i);
+ pci_claim_resource(dev, i);
}
}
}
Index: linux-2.6.14-rc2-git6/drivers/pci/hotplug/rpadlpar_core.c
===================================================================
--- linux-2.6.14-rc2-git6.orig/drivers/pci/hotplug/rpadlpar_core.c 2005-10-06 17:53:50.226860151 -0500
+++ linux-2.6.14-rc2-git6/drivers/pci/hotplug/rpadlpar_core.c 2005-10-06 17:53:57.834792686 -0500
@@ -109,28 +109,6 @@
return NULL;
}

-static void rpadlpar_claim_one_bus(struct pci_bus *b)
-{
- struct list_head *ld;
- struct pci_bus *child_bus;
-
- for (ld = b->devices.next; ld != &b->devices; ld = ld->next) {
- struct pci_dev *dev = pci_dev_b(ld);
- int i;
-
- for (i = 0; i < PCI_NUM_RESOURCES; i++) {
- struct resource *r = &dev->resource[i];
-
- if (r->parent || !r->start || !r->flags)
- continue;
- rpaphp_claim_resource(dev, i);
- }
- }
-
- list_for_each_entry(child_bus, &b->children, node)
- rpadlpar_claim_one_bus(child_bus);
-}
-
static int pci_add_secondary_bus(struct device_node *dn,
struct pci_dev *bridge_dev)
{
@@ -155,7 +133,7 @@
pcibios_fixup_bus(child);

/* Claim new bus resources */
- rpadlpar_claim_one_bus(bridge_dev->bus);
+ pcibios_claim_one_bus(bridge_dev->bus);

if (hose->last_busno < child->number)
hose->last_busno = child->number;
Index: linux-2.6.14-rc2-git6/arch/ppc64/kernel/pci.c
===================================================================
--- linux-2.6.14-rc2-git6.orig/arch/ppc64/kernel/pci.c 2005-10-06 17:50:25.899529261 -0500
+++ linux-2.6.14-rc2-git6/arch/ppc64/kernel/pci.c 2005-10-06 17:53:57.836792405 -0500
@@ -198,7 +198,7 @@
spin_unlock(&hose_spinlock);
}

-static void __init pcibios_claim_one_bus(struct pci_bus *b)
+void __devinit pcibios_claim_one_bus(struct pci_bus *b)
{
struct pci_dev *dev;
struct pci_bus *child_bus;
Index: linux-2.6.14-rc2-git6/include/asm-ppc64/pci.h
===================================================================
--- linux-2.6.14-rc2-git6.orig/include/asm-ppc64/pci.h 2005-10-06 17:50:25.899529261 -0500
+++ linux-2.6.14-rc2-git6/include/asm-ppc64/pci.h 2005-10-06 17:53:57.836792405 -0500
@@ -160,6 +160,8 @@
extern void
pcibios_fixup_device_resources(struct pci_dev *dev, struct pci_bus *bus);

+extern void pcibios_claim_one_bus(struct pci_bus *b);
+
extern struct pci_controller *init_phb_dynamic(struct device_node *dn);

extern int pci_read_irq_line(struct pci_dev *dev);

2005-10-06 23:46:27

by linas

Subject: [PATCH 14/22] ppc64: RPA PHP to EEH code movement


14-rpaphp-migrate.patch

This patch moves some pci device add & remove code from the PCI
hotplug directory to the arch/ppc64/kernel directory, and cleans
it up a tad. The primary reason for this is that the code performs
some fairly generic operations that are shared with the PCI error
recovery code (living in the arch/ppc64/kernel directory).
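
One of the routines being moved, pcibios_find_pci_bus(), is a
depth-first search of the bus tree for the bus whose firmware node
matches. A rough user-space sketch of that search is given below.
The mock_* types and the first_child/next_sibling links are
assumptions made for this example; the kernel code walks
bus->children with list_for_each() instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device_node. */
struct mock_node {
	const char *name;
};

/* Hypothetical stand-in for struct pci_bus. */
struct mock_bus {
	struct mock_node *of_node;	/* firmware node for this bus */
	struct mock_bus *first_child;
	struct mock_bus *next_sibling;
};

/*
 * Depth-first search: return the bus whose firmware node is dn,
 * or NULL if no bus at or below 'bus' matches.
 */
struct mock_bus *find_bus(struct mock_bus *bus, struct mock_node *dn)
{
	struct mock_bus *child, *found;

	assert(dn != NULL);
	if (bus == NULL)
		return NULL;
	if (bus->of_node == dn)
		return bus;

	for (child = bus->first_child; child; child = child->next_sibling) {
		found = find_bus(child, dn);
		if (found)
			return found;
	}
	return NULL;
}
```

The search starts from the root bus of the PHB that owns the node, so
a node that belongs to a different PHB simply yields NULL.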

Signed-off-by: Linas Vepstas <[email protected]>

Index: linux-2.6.14-rc2-git6/arch/ppc64/kernel/pci_dlpar.c
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.14-rc2-git6/arch/ppc64/kernel/pci_dlpar.c 2005-10-06 17:54:00.306445890 -0500
@@ -0,0 +1,174 @@
+/*
+ * PCI Dynamic LPAR, PCI Hot Plug and PCI EEH recovery code
+ * for RPA-compliant PPC64 platform.
+ * Copyright (C) 2003 Linda Xie <[email protected]>
+ * Copyright (C) 2005 International Business Machines
+ *
+ * Updates, 2005, John Rose <[email protected]>
+ * Updates, 2005, Linas Vepstas <[email protected]>
+ *
+ * All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or (at
+ * your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or
+ * NON INFRINGEMENT. See the GNU General Public License for more
+ * details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+
+#include <linux/pci.h>
+#include <asm/pci-bridge.h>
+
+static struct pci_bus *
+find_bus_among_children(struct pci_bus *bus,
+ struct device_node *dn)
+{
+ struct pci_bus *child = NULL;
+ struct list_head *tmp;
+ struct device_node *busdn;
+
+ busdn = pci_bus_to_OF_node(bus);
+ if (busdn == dn)
+ return bus;
+
+ list_for_each(tmp, &bus->children) {
+ child = find_bus_among_children(pci_bus_b(tmp), dn);
+ if (child)
+ break;
+ }
+ return child;
+}
+
+struct pci_bus *
+pcibios_find_pci_bus(struct device_node *dn)
+{
+ struct pci_dn *pdn = dn->data;
+
+ if (!pdn || !pdn->phb || !pdn->phb->bus)
+ return NULL;
+
+ return find_bus_among_children(pdn->phb->bus, dn);
+}
+
+/**
+ * pcibios_remove_pci_devices - remove all devices under this bus
+ *
+ * Remove all of the PCI devices under this bus both from the
+ * linux pci device tree, and from the ppc64 EEH address cache.
+ */
+void
+pcibios_remove_pci_devices(struct pci_bus *bus)
+{
+ struct pci_dev *dev, *tmp;
+
+ list_for_each_entry_safe(dev, tmp, &bus->devices, bus_list) {
+ eeh_remove_bus_device(dev);
+ pci_remove_bus_device(dev);
+ }
+}
+
+/* Must be called before pci_bus_add_devices */
+static void
+pcibios_fixup_new_pci_devices(struct pci_bus *bus, int fix_bus)
+{
+ struct pci_dev *dev;
+
+ list_for_each_entry(dev, &bus->devices, bus_list) {
+ /*
+ * Skip already-present devices (which are on the
+ * global device list.)
+ */
+ if (list_empty(&dev->global_list)) {
+ int i;
+
+ /* Need to setup IOMMU tables */
+ ppc_md.iommu_dev_setup(dev);
+
+ if(fix_bus)
+ pcibios_fixup_device_resources(dev, bus);
+ pci_read_irq_line(dev);
+ for (i = 0; i < PCI_NUM_RESOURCES; i++) {
+ struct resource *r = &dev->resource[i];
+
+ if (r->parent || !r->start || !r->flags)
+ continue;
+ pci_claim_resource(dev, i);
+ }
+ }
+ }
+}
+
+static int
+pcibios_pci_config_bridge(struct pci_dev *dev)
+{
+ u8 sec_busno;
+ struct pci_bus *child_bus;
+ struct pci_dev *child_dev;
+
+ /* Get busno of downstream bus */
+ pci_read_config_byte(dev, PCI_SECONDARY_BUS, &sec_busno);
+
+ /* Add to children of PCI bridge dev->bus */
+ child_bus = pci_add_new_bus(dev->bus, dev, sec_busno);
+ if (!child_bus) {
+ printk (KERN_ERR "%s: could not add second bus\n", __FUNCTION__);
+ return -EIO;
+ }
+ sprintf(child_bus->name, "PCI Bus #%02x", child_bus->number);
+
+ pci_scan_child_bus(child_bus);
+
+ list_for_each_entry(child_dev, &child_bus->devices, bus_list) {
+ eeh_add_device_late(child_dev);
+ }
+
+ /* Fixup new pci devices without touching bus struct */
+ pcibios_fixup_new_pci_devices(child_bus, 0);
+
+ /* Make the discovered devices available */
+ pci_bus_add_devices(child_bus);
+ return 0;
+}
+
+/**
+ * pcibios_add_pci_devices - adds new pci devices to bus
+ *
+ * This routine will find and fixup new pci devices under
+ * the indicated bus. This routine presumes that there
+ * might already be some devices under this bridge, so
+ * it carefully tries to add only new devices. (And that
+ * is how this routine differs from other, similar pcibios
+ * routines.)
+ */
+void
+pcibios_add_pci_devices(struct pci_bus * bus)
+{
+ int slotno, num;
+ struct pci_dev *dev;
+ struct device_node *dn = pci_bus_to_OF_node(bus);
+
+ eeh_add_device_tree_early(dn);
+
+ /* pci_scan_slot should find all children */
+ slotno = PCI_SLOT(PCI_DN(dn->child)->devfn);
+ num = pci_scan_slot(bus, PCI_DEVFN(slotno, 0));
+ if (num) {
+ pcibios_fixup_new_pci_devices(bus, 1);
+ pci_bus_add_devices(bus);
+ }
+
+ list_for_each_entry(dev, &bus->devices, bus_list) {
+ eeh_add_device_late (dev);
+ if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE)
+ pcibios_pci_config_bridge(dev);
+ }
+}
Index: linux-2.6.14-rc2-git6/arch/ppc64/kernel/Makefile
===================================================================
--- linux-2.6.14-rc2-git6.orig/arch/ppc64/kernel/Makefile 2005-10-06 17:50:25.365604176 -0500
+++ linux-2.6.14-rc2-git6/arch/ppc64/kernel/Makefile 2005-10-06 17:54:00.307445749 -0500
@@ -37,7 +37,7 @@
bpa_iic.o spider-pic.o

obj-$(CONFIG_KEXEC) += machine_kexec.o
-obj-$(CONFIG_EEH) += eeh.o eeh_event.o
+obj-$(CONFIG_EEH) += eeh.o eeh_event.o pci_dlpar.o
obj-$(CONFIG_PROC_FS) += proc_ppc64.o
obj-$(CONFIG_RTAS_FLASH) += rtas_flash.o
obj-$(CONFIG_SMP) += smp.o
Index: linux-2.6.14-rc2-git6/drivers/pci/hotplug/rpaphp_pci.c
===================================================================
--- linux-2.6.14-rc2-git6.orig/drivers/pci/hotplug/rpaphp_pci.c 2005-10-06 17:53:57.832792967 -0500
+++ linux-2.6.14-rc2-git6/drivers/pci/hotplug/rpaphp_pci.c 2005-10-06 17:54:00.308445609 -0500
@@ -30,36 +30,6 @@

#include "rpaphp.h"

-static struct pci_bus *find_bus_among_children(struct pci_bus *bus,
- struct device_node *dn)
-{
- struct pci_bus *child = NULL;
- struct list_head *tmp;
- struct device_node *busdn;
-
- busdn = pci_bus_to_OF_node(bus);
- if (busdn == dn)
- return bus;
-
- list_for_each(tmp, &bus->children) {
- child = find_bus_among_children(pci_bus_b(tmp), dn);
- if (child)
- break;
- }
- return child;
-}
-
-struct pci_bus *rpaphp_find_pci_bus(struct device_node *dn)
-{
- struct pci_dn *pdn = dn->data;
-
- if (!pdn || !pdn->phb || !pdn->phb->bus)
- return NULL;
-
- return find_bus_among_children(pdn->phb->bus, dn);
-}
-EXPORT_SYMBOL_GPL(rpaphp_find_pci_bus);
-
static int rpaphp_get_sensor_state(struct slot *slot, int *state)
{
int rc;
@@ -118,7 +88,7 @@
/* config/unconfig adapter */
*value = slot->state;
} else {
- bus = rpaphp_find_pci_bus(slot->dn);
+ bus = pcibios_find_pci_bus(slot->dn);
if (bus && !list_empty(&bus->devices))
*value = CONFIGURED;
else
@@ -129,117 +99,6 @@
return rc;
}

-/* Must be called before pci_bus_add_devices */
-static void
-rpaphp_fixup_new_pci_devices(struct pci_bus *bus, int fix_bus)
-{
- struct pci_dev *dev;
-
- list_for_each_entry(dev, &bus->devices, bus_list) {
- /*
- * Skip already-present devices (which are on the
- * global device list.)
- */
- if (list_empty(&dev->global_list)) {
- int i;
-
- /* Need to setup IOMMU tables */
- ppc_md.iommu_dev_setup(dev);
-
- if(fix_bus)
- pcibios_fixup_device_resources(dev, bus);
- pci_read_irq_line(dev);
- for (i = 0; i < PCI_NUM_RESOURCES; i++) {
- struct resource *r = &dev->resource[i];
-
- if (r->parent || !r->start || !r->flags)
- continue;
- pci_claim_resource(dev, i);
- }
- }
- }
-}
-
-static int rpaphp_pci_config_bridge(struct pci_dev *dev)
-{
- u8 sec_busno;
- struct pci_bus *child_bus;
- struct pci_dev *child_dev;
-
- dbg("Enter %s: BRIDGE dev=%s\n", __FUNCTION__, pci_name(dev));
-
- /* get busno of downstream bus */
- pci_read_config_byte(dev, PCI_SECONDARY_BUS, &sec_busno);
-
- /* add to children of PCI bridge dev->bus */
- child_bus = pci_add_new_bus(dev->bus, dev, sec_busno);
- if (!child_bus) {
- err("%s: could not add second bus\n", __FUNCTION__);
- return -EIO;
- }
- sprintf(child_bus->name, "PCI Bus #%02x", child_bus->number);
- /* do pci_scan_child_bus */
- pci_scan_child_bus(child_bus);
-
- list_for_each_entry(child_dev, &child_bus->devices, bus_list) {
- eeh_add_device_late(child_dev);
- }
-
- /* fixup new pci devices without touching bus struct */
- rpaphp_fixup_new_pci_devices(child_bus, 0);
-
- /* Make the discovered devices available */
- pci_bus_add_devices(child_bus);
- return 0;
-}
-
-/*****************************************************************************
- rpaphp_pci_config_slot() will configure all devices under the
- given slot->dn and return the the first pci_dev.
- *****************************************************************************/
-int
-rpaphp_config_pci_adapter(struct pci_bus *bus)
-{
- struct device_node *dn = pci_bus_to_OF_node(bus);
- struct pci_dev *dev = NULL;
- int rc = -ENODEV;
- int slotno;
- int num;
-
- dbg("Enter %s: dn=%s bus=%s\n", __FUNCTION__, dn->full_name, bus->name);
- if (!dn || !dn->child)
- goto exit;
-
- eeh_add_device_tree_early(dn);
-
- slotno = PCI_SLOT(PCI_DN(dn->child)->devfn);
-
- /* pci_scan_slot should find all children */
- num = pci_scan_slot(bus, PCI_DEVFN(slotno, 0));
- if (num) {
- rpaphp_fixup_new_pci_devices(bus, 1);
- pci_bus_add_devices(bus);
- }
- if (list_empty(&bus->devices)) {
- err("%s: No new device found\n", __FUNCTION__);
- goto exit;
- }
- list_for_each_entry(dev, &bus->devices, bus_list) {
- if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE)
- rpaphp_pci_config_bridge(dev);
- }
-
- dbg("%s: pci_devs of slot[%s]\n", __FUNCTION__, dn->full_name);
- list_for_each_entry (dev, &bus->devices, bus_list)
- dbg("\t%s\n", pci_name(dev));
-
- rc = 0;
-exit:
- dbg("Exit %s: rc=%d\n", __FUNCTION__, rc);
- return rc;
-}
-EXPORT_SYMBOL_GPL(rpaphp_config_pci_adapter);
-
static void print_slot_pci_funcs(struct pci_bus *bus)
{
struct device_node *dn;
@@ -255,17 +114,6 @@
return;
}

-int rpaphp_unconfig_pci_adapter(struct pci_bus *bus)
-{
- struct pci_dev *dev, *tmp;
-
- list_for_each_entry_safe(dev, tmp, &bus->devices, bus_list) {
- eeh_remove_bus_device(dev);
- pci_remove_bus_device(dev);
- }
- return 0;
-}
-
static int setup_pci_hotplug_slot_info(struct slot *slot)
{
dbg("%s Initilize the PCI slot's hotplug->info structure ...\n",
@@ -301,7 +149,7 @@
struct pci_bus *bus;

BUG_ON(!dn);
- bus = rpaphp_find_pci_bus(dn);
+ bus = pcibios_find_pci_bus(dn);
if (!bus) {
err("%s: no pci_bus for dn %s\n", __FUNCTION__, dn->full_name);
goto exit_rc;
@@ -326,10 +174,7 @@
if (slot->hotplug_slot->info->adapter_status == NOT_CONFIGURED) {
dbg("%s CONFIGURING pci adapter in slot[%s]\n",
__FUNCTION__, slot->name);
- if (rpaphp_config_pci_adapter(slot->bus)) {
- err("%s: CONFIG pci adapter failed\n", __FUNCTION__);
- goto exit_rc;
- }
+ pcibios_add_pci_devices(slot->bus);

} else if (slot->hotplug_slot->info->adapter_status != CONFIGURED) {
err("%s: slot[%s]'s adapter_status is NOT_VALID.\n",
@@ -375,16 +220,10 @@
/* if slot is not empty, enable the adapter */
if (state == PRESENT) {
dbg("%s : slot[%s] is occupied.\n", __FUNCTION__, slot->name);
- retval = rpaphp_config_pci_adapter(slot->bus);
- if (!retval) {
- slot->state = CONFIGURED;
- dbg("%s: PCI devices in slot[%s] has been configured\n",
+ pcibios_add_pci_devices(slot->bus);
+ slot->state = CONFIGURED;
+ dbg("%s: PCI devices in slot[%s] has been configured\n",
__FUNCTION__, slot->name);
- } else {
- slot->state = NOT_CONFIGURED;
- dbg("%s: no pci_dev struct for adapter in slot[%s]\n",
- __FUNCTION__, slot->name);
- }
} else if (state == EMPTY) {
dbg("%s : slot[%s] is empty\n", __FUNCTION__, slot->name);
slot->state = EMPTY;
Index: linux-2.6.14-rc2-git6/drivers/pci/hotplug/rpadlpar_core.c
===================================================================
--- linux-2.6.14-rc2-git6.orig/drivers/pci/hotplug/rpadlpar_core.c 2005-10-06 17:53:57.834792686 -0500
+++ linux-2.6.14-rc2-git6/drivers/pci/hotplug/rpadlpar_core.c 2005-10-06 17:54:00.309445469 -0500
@@ -194,9 +194,8 @@
static int dlpar_add_pci_slot(char *drc_name, struct device_node *dn)
{
struct pci_dev *dev;
- int rc;

- if (rpaphp_find_pci_bus(dn))
+ if (pcibios_find_pci_bus(dn))
return -EINVAL;

/* Add pci bus */
@@ -208,12 +207,7 @@
}

if (dn->child) {
- rc = rpaphp_config_pci_adapter(dev->subordinate);
- if (rc < 0) {
- printk(KERN_ERR "%s: unable to enable slot %s\n",
- __FUNCTION__, drc_name);
- return -EIO;
- }
+ pcibios_add_pci_devices(dev->subordinate);
}

/* Add hotplug slot */
@@ -252,7 +246,7 @@
struct pci_dn *pdn;
int rc = 0;

- if (!rpaphp_find_pci_bus(dn))
+ if (!pcibios_find_pci_bus(dn))
return -EINVAL;

slot = find_slot(dn);
@@ -397,7 +391,7 @@
struct pci_bus *bus;
struct slot *slot;

- bus = rpaphp_find_pci_bus(dn);
+ bus = pcibios_find_pci_bus(dn);
if (!bus)
return -EINVAL;

Index: linux-2.6.14-rc2-git6/drivers/pci/hotplug/rpaphp_core.c
===================================================================
--- linux-2.6.14-rc2-git6.orig/drivers/pci/hotplug/rpaphp_core.c 2005-10-06 17:50:25.366604035 -0500
+++ linux-2.6.14-rc2-git6/drivers/pci/hotplug/rpaphp_core.c 2005-10-06 17:54:00.310445328 -0500
@@ -426,7 +426,8 @@

dbg("DISABLING SLOT %s\n", slot->name);
down(&rpaphp_sem);
- retval = rpaphp_unconfig_pci_adapter(slot->bus);
+ pcibios_remove_pci_devices(slot->bus);
+ retval = 0;
up(&rpaphp_sem);
slot->state = NOT_CONFIGURED;
info("%s: devices in slot[%s] unconfigured.\n", __FUNCTION__,
Index: linux-2.6.14-rc2-git6/include/asm-ppc64/pci-bridge.h
===================================================================
--- linux-2.6.14-rc2-git6.orig/include/asm-ppc64/pci-bridge.h 2005-10-06 17:50:25.366604035 -0500
+++ linux-2.6.14-rc2-git6/include/asm-ppc64/pci-bridge.h 2005-10-06 17:54:00.310445328 -0500
@@ -103,9 +103,18 @@
return bus->sysdata; /* Must be root bus (PHB) */
}

+/** Find the bus corresponding to the indicated device node */
+struct pci_bus * pcibios_find_pci_bus(struct device_node *dn);
+
extern void pci_process_bridge_OF_ranges(struct pci_controller *hose,
struct device_node *dev);

+/** Remove all of the PCI devices under this bus */
+void pcibios_remove_pci_devices(struct pci_bus *bus);
+
+/** Discover new pci devices under this bus, and add them */
+void pcibios_add_pci_devices(struct pci_bus * bus);
+
extern int pcibios_remove_root_bus(struct pci_controller *phb);

extern void phbs_remap_io(void);

2005-10-06 23:47:45

by linas

Subject: [PATCH 15/22] ppc64: PCI Error Recovery: PPC64 core recovery routines


PCI Error Recovery: PPC64 core recovery routines

Various PCI bus errors can be signaled by newer PCI controllers. The
core error recovery routines are architecture dependent. This patch adds
a recovery infrastructure for the PPC64 pSeries systems.
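
Two small pieces of decision logic in this patch may be easier to
follow in isolation: the mapping from the firmware reset-state code
to a channel state, and the way the per-driver responses are merged
into one result for the slot. The user-space sketch below mirrors
the logic as posted. The enum and function names are hypothetical
and chosen only for this example; the numeric codes 2, 4 and 5 are
the ones tested in the eeh.c hunk.

```c
#include <assert.h>

/* Hypothetical stand-ins for pci_channel_state / pcierr_result. */
enum chan_state { CHAN_NORMAL, CHAN_FROZEN, CHAN_PERM_FAILURE };
enum err_result { RES_NONE, RES_CAN_RECOVER, RES_NEED_RESET, RES_DISCONNECT };

/* Map the firmware reset-state code to a channel state. */
enum chan_state state_from_reset_code(int code)
{
	if (code == 2 || code == 4)
		return CHAN_FROZEN;
	if (code == 5)
		return CHAN_PERM_FAILURE;
	return CHAN_NORMAL;
}

/*
 * Fold one driver's response 'rc' into the accumulated result 'acc':
 *  - the first response is taken as-is;
 *  - a "need reset" overrides an earlier "disconnect";
 *  - otherwise the accumulated value is kept.
 */
enum err_result merge_result(enum err_result acc, enum err_result rc)
{
	if (acc == RES_NONE)
		return rc;
	if (acc == RES_DISCONNECT && rc == RES_NEED_RESET)
		return rc;
	return acc;
}

/* Usage example: walk three imaginary driver responses. */
void merge_self_check(void)
{
	enum err_result res = RES_NONE;

	res = merge_result(res, RES_DISCONNECT);
	res = merge_result(res, RES_NEED_RESET);
	res = merge_result(res, RES_CAN_RECOVER);
	assert(res == RES_NEED_RESET);
}
```

Note that with this merge rule an accumulated "can recover" is not
upgraded by a later "need reset". In the patch as posted that makes
no practical difference, since both outcomes end in a slot reset.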

Signed-off-by: Linas Vepstas <[email protected]>

Index: linux-2.6.14-rc2-git6/arch/ppc64/kernel/eeh.c
===================================================================
--- linux-2.6.14-rc2-git6.orig/arch/ppc64/kernel/eeh.c 2005-10-06 17:53:52.475544639 -0500
+++ linux-2.6.14-rc2-git6/arch/ppc64/kernel/eeh.c 2005-10-06 17:54:14.494455177 -0500
@@ -486,6 +486,11 @@
if (PCI_DN(dn)) {
PCI_DN(dn)->eeh_mode |= mode_flag;

+ /* Mark the pci device driver too */
+ struct pci_dev *dev = PCI_DN(dn)->pcidev;
+ if (dev && dev->driver)
+ dev->error_state = pci_channel_io_frozen;
+
if (dn->child)
__eeh_mark_slot (dn->child, mode_flag);
}
@@ -545,6 +550,7 @@
int rets[3];
unsigned long flags;
struct pci_dn *pdn;
+ enum pci_channel_state state;
int rc = 0;

__get_cpu_var(total_mmio_ffs)++;
@@ -649,8 +655,13 @@
eeh_mark_slot (dn, EEH_MODE_ISOLATED);
spin_unlock_irqrestore(&confirm_error_lock, flags);

- eeh_send_failure_event (dn, dev, rets[0], rets[2]);
-
+ state = pci_channel_io_normal;
+ if ((rets[0] == 2) || (rets[0] == 4))
+ state = pci_channel_io_frozen;
+ if (rets[0] == 5)
+ state = pci_channel_io_perm_failure;
+ eeh_send_failure_event (dn, dev, state, rets[2]);
+
/* Most EEH events are due to device driver bugs. Having
* a stack trace will help the device-driver authors figure
* out what happened. So print that out. */
@@ -954,8 +965,10 @@
* But there are a few cases like display devices that make sense.
*/
enable = 1; /* i.e. we will do checking */
+#if 0
if ((*class_code >> 16) == PCI_BASE_CLASS_DISPLAY)
enable = 0;
+#endif

if (!enable)
pdn->eeh_mode |= EEH_MODE_NOCHECK;
Index: linux-2.6.14-rc2-git6/arch/ppc64/kernel/eeh_driver.c
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.14-rc2-git6/arch/ppc64/kernel/eeh_driver.c 2005-10-06 17:54:14.495455037 -0500
@@ -0,0 +1,376 @@
+/*
+ * PCI Error Recovery Driver for RPA-compliant PPC64 platform.
+ * Copyright (C) 2004, 2005 Linas Vepstas <[email protected]>
+ *
+ * All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or (at
+ * your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or
+ * NON INFRINGEMENT. See the GNU General Public License for more
+ * details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ *
+ * Send feedback to <[email protected]>
+ *
+ */
+#include <linux/delay.h>
+#include <linux/irq.h>
+#include <linux/interrupt.h>
+#include <linux/notifier.h>
+#include <linux/pci.h>
+#include <asm/eeh.h>
+#include <asm/pci-bridge.h>
+#include <asm/prom.h>
+#include <asm/rtas.h>
+
+#include "eeh_event.h"
+#include "pci.h"
+
+static inline const char * pcid_name (struct pci_dev *pdev)
+{
+ if (pdev->dev.driver)
+ return pdev->dev.driver->name;
+ return "";
+}
+
+/**
+ * Return the "partitionable endpoint" (pe) under which this device lies
+ */
+static struct device_node * find_device_pe(struct device_node *dn)
+{
+ while ((dn->parent) && PCI_DN(dn->parent) &&
+ (PCI_DN(dn->parent)->eeh_mode & EEH_MODE_SUPPORTED)) {
+ dn = dn->parent;
+ }
+ return dn;
+}
+
+
+#ifdef DEBUG
+static void print_device_node_tree (struct pci_dn *pdn, int dent)
+{
+ int i;
+ if (!pdn) return;
+ for (i=0;i<dent; i++)
+ printk(" ");
+ printk("dn=%s mode=%x \tcfg_addr=%x pe_addr=%x \tfull=%s\n",
+ pdn->node->name, pdn->eeh_mode, pdn->eeh_config_addr,
+ pdn->eeh_pe_config_addr, pdn->node->full_name);
+ dent += 3;
+ struct device_node *pc = pdn->node->child;
+ while (pc) {
+ print_device_node_tree(PCI_DN(pc), dent);
+ pc = pc->sibling;
+ }
+}
+#endif
+
+/**
+ * irq_in_use - return true if this irq is being used
+ */
+static int irq_in_use(unsigned int irq)
+{
+ int rc = 0;
+ unsigned long flags;
+ struct irq_desc *desc = irq_desc + irq;
+
+ spin_lock_irqsave(&desc->lock, flags);
+ if (desc->action)
+ rc = 1;
+ spin_unlock_irqrestore(&desc->lock, flags);
+ return rc;
+}
+
+/* ------------------------------------------------------- */
+/** eeh_report_error - report an EEH error to each device,
+ * collect up and merge the device responses.
+ */
+
+static void eeh_report_error(struct pci_dev *dev, void *userdata)
+{
+ enum pcierr_result rc, *res = userdata;
+ struct pci_driver *driver = dev->driver;
+
+ dev->error_state = pci_channel_io_frozen;
+
+ if (!driver)
+ return;
+
+ if (irq_in_use (dev->irq)) {
+ struct device_node *dn = pci_device_to_OF_node(dev);
+ PCI_DN(dn)->eeh_mode |= EEH_MODE_IRQ_DISABLED;
+ disable_irq_nosync(dev->irq);
+ }
+ if (!driver->err_handler)
+ return;
+ if (!driver->err_handler->error_detected)
+ return;
+
+ rc = driver->err_handler->error_detected (dev, pci_channel_io_frozen);
+ if (*res == PCIERR_RESULT_NONE) *res = rc;
+ if (*res == PCIERR_RESULT_NEED_RESET) return;
+ if (*res == PCIERR_RESULT_DISCONNECT &&
+ rc == PCIERR_RESULT_NEED_RESET) *res = rc;
+}
+
+/** eeh_report_reset -- tell this device that the pci slot
+ * has been reset.
+ */
+
+static void eeh_report_reset(struct pci_dev *dev, void *userdata)
+{
+ struct pci_driver *driver = dev->driver;
+ struct device_node *dn = pci_device_to_OF_node(dev);
+
+ if (!driver)
+ return;
+
+ if ((PCI_DN(dn)->eeh_mode) & EEH_MODE_IRQ_DISABLED) {
+ PCI_DN(dn)->eeh_mode &= ~EEH_MODE_IRQ_DISABLED;
+ enable_irq(dev->irq);
+ }
+ if (!driver->err_handler)
+ return;
+ if (!driver->err_handler->slot_reset)
+ return;
+
+ driver->err_handler->slot_reset(dev);
+}
+
+static void eeh_report_resume(struct pci_dev *dev, void *userdata)
+{
+ struct pci_driver *driver = dev->driver;
+
+ dev->error_state = pci_channel_io_normal;
+
+ if (!driver)
+ return;
+ if (!driver->err_handler)
+ return;
+ if (!driver->err_handler->resume)
+ return;
+
+ driver->err_handler->resume(dev);
+}
+
+static void eeh_report_failure(struct pci_dev *dev, void *userdata)
+{
+ struct pci_driver *driver = dev->driver;
+
+ dev->error_state = pci_channel_io_perm_failure;
+
+ if (!driver)
+ return;
+
+ if (irq_in_use (dev->irq)) {
+ struct device_node *dn = pci_device_to_OF_node(dev);
+ PCI_DN(dn)->eeh_mode |= EEH_MODE_IRQ_DISABLED;
+ disable_irq_nosync(dev->irq);
+ }
+ if (!driver->err_handler)
+ return;
+ if (!driver->err_handler->error_detected)
+ return;
+ driver->err_handler->error_detected(dev, pci_channel_io_perm_failure);
+}
+
+/* ------------------------------------------------------- */
+/**
+ * handle_eeh_events -- reset a PCI device after hard lockup.
+ *
+ * pSeries systems will isolate a PCI slot if the PCI-Host
+ * bridge detects address or data parity errors, DMAs
+ * occurring to wild addresses (which usually happen due to
+ * bugs in device drivers or in PCI adapter firmware).
+ * Slot isolations also occur if #SERR, #PERR or other misc
+ * PCI-related errors are detected.
+ *
+ * The recovery process consists of unplugging the device driver
+ * (which generates hotplug events to userspace), then issuing
+ * a PCI #RST to the device, then reconfiguring the PCI config
+ * space for all bridges & devices under this slot, and then
+ * finally restarting the device drivers (which causes a second
+ * set of hotplug events to go out to userspace).
+ */
+
+/**
+ * eeh_reset_device() -- perform actual reset of a pci slot
+ * Args: bus: pointer to the pci bus structure corresponding
+ * to the isolated slot. A non-null value will
+ * cause all devices under the bus to be removed
+ * and then re-added.
+ * pe_dn: pointer to a "Partitionable Endpoint" device node.
+ * This is the top-level structure on which pci
+ * bus resets can be performed.
+ */
+
+static void eeh_reset_device (struct pci_dn *pe_dn, struct pci_bus *bus)
+{
+ if (bus)
+ pcibios_remove_pci_devices(bus);
+
+ /* Reset the pci controller. (Asserts RST#; resets config space).
+ * Reconfigure bridges and devices */
+ rtas_set_slot_reset(pe_dn);
+
+ /* Walk over all functions on this device */
+ rtas_configure_bridge(pe_dn);
+ eeh_restore_bars(pe_dn);
+
+ /* Give the system 5 seconds to finish running the user-space
+ * hotplug shutdown scripts, e.g. ifdown for ethernet. Yes,
+ * this is a hack, but if we don't do this, and try to bring
+ * the device up before the scripts have taken it down,
+ * potentially weird things happen.
+ */
+ if (bus) {
+ ssleep (5);
+ pcibios_add_pci_devices(bus);
+ }
+}
+
+/* The longest amount of time to wait for a pci device
+ * to come back on line, in seconds.
+ */
+#define MAX_WAIT_FOR_RECOVERY 15
+
+void handle_eeh_events (struct eeh_event *event)
+{
+ struct device_node *frozen_dn;
+ struct pci_dn *frozen_pdn;
+ struct pci_bus *frozen_bus;
+ struct pci_dev *dev = event->dev;
+ int perm_failure = 0;
+
+ /* We might not have a pci device, if it was a config space read
+ * that failed. Find the pci device now. */
+ if (!dev) {
+ while ((dev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, dev)) != NULL) {
+ if (pci_device_to_OF_node(dev) == event->dn)
+ break;
+ }
+ }
+
+ frozen_dn = find_device_pe(event->dn);
+ frozen_bus = pcibios_find_pci_bus(frozen_dn);
+
+ if (!frozen_dn) {
+ printk(KERN_ERR "EEH: Cannot find PCI controller for %s\n",
+ pci_name(dev));
+ return;
+ }
+
+ /* There are two different styles for coming up with the PE.
+ * In the old style, it was the highest EEH-capable device
+ * which was always an EADS pci bridge. In the new style,
+ * there might not be any EADS bridges, and even when there are,
+ * the firmware marks them as "EEH incapable". So another
+ * two-step is needed to find the pci bus.. */
+ if (!frozen_bus)
+ frozen_bus = pcibios_find_pci_bus (frozen_dn->parent);
+
+ if (!frozen_bus) {
+ printk(KERN_ERR "EEH: Cannot find PCI bus for %s\n",
+ frozen_dn->full_name);
+ return;
+ }
+
+ if (!dev)
+ dev = frozen_bus->self;
+
+#if 0
+ /* We may get "permanent failure" messages on empty slots.
+ * These are false alarms. Empty slots have no child dn. */
+ if ((event->state == pci_channel_io_perm_failure) && (frozen_device == NULL))
+ return;
+#endif
+
+ frozen_pdn = PCI_DN(frozen_dn);
+ frozen_pdn->eeh_freeze_count++;
+
+ if (frozen_pdn->eeh_freeze_count > EEH_MAX_ALLOWED_FREEZES)
+ perm_failure = 1;
+
+ /* If the reset state is a '5' and the time to reset is 0 (infinity)
+ * or is more than 15 seconds, then mark this as a permanent failure.
+ */
+ if ((event->state == pci_channel_io_perm_failure) &&
+ ((event->time_unavail <= 0) ||
+ (event->time_unavail > MAX_WAIT_FOR_RECOVERY*1000)))
+ {
+ perm_failure = 1;
+ }
+
+ /* Log the error with the rtas logger. */
+ if (perm_failure) {
+ /*
+ * About 90% of all real-life EEH failures in the field
+ * are due to poorly seated PCI cards. Only 10% or so are
+ * due to actual, failed cards.
+ */
+ printk(KERN_ERR
+ "EEH: PCI device %s - %s has failed %d times \n"
+ "and has been permanently disabled. Please try reseating\n"
+ "this device or replacing it.\n",
+ pci_name (dev), pcid_name(dev), frozen_pdn->eeh_freeze_count);
+
+ eeh_slot_error_detail(frozen_pdn, 2 /* Permanent Error */);
+
+ /* Notify all devices that they're about to go down. */
+ pci_walk_bus(frozen_bus, eeh_report_failure, 0);
+
+ /* Shut down the device drivers for good. */
+ pcibios_remove_pci_devices(frozen_bus);
+ return;
+ }
+
+ eeh_slot_error_detail(frozen_pdn, 1 /* Temporary Error */);
+ printk(KERN_WARNING
+ "EEH: This PCI device has failed %d times since last reboot: %s - %s\n",
+ frozen_pdn->eeh_freeze_count,
+ pci_name (dev), pcid_name(dev));
+
+ /* Walk the various device drivers attached to this slot through
+ * a reset sequence, giving each an opportunity to do what it needs
+ * to accomplish the reset. Each child gets a report of the
+ * status ... if any child can't handle the reset, then the entire
+ * slot is dlpar removed and added.
+ */
+ enum pcierr_result result = PCIERR_RESULT_NONE;
+ pci_walk_bus(frozen_bus, eeh_report_error, &result);
+
+ /* If all device drivers were EEH-unaware, then shut
+ * down all of the device drivers, and hope they
+ * go down willingly, without panicking the system.
+ */
+ if (result == PCIERR_RESULT_NONE) {
+ eeh_reset_device(frozen_pdn, frozen_bus);
+ }
+
+ /* If any device called out for a reset, then reset the slot */
+ if (result == PCIERR_RESULT_NEED_RESET) {
+ eeh_reset_device(frozen_pdn, NULL);
+ pci_walk_bus(frozen_bus, eeh_report_reset, 0);
+ }
+
+ /* If all devices reported they can proceed, then re-enable PIO */
+ if (result == PCIERR_RESULT_CAN_RECOVER) {
+ /* XXX Not supported; we brute-force reset the device */
+ eeh_reset_device(frozen_pdn, NULL);
+ pci_walk_bus(frozen_bus, eeh_report_reset, 0);
+ }
+
+ /* Tell all device drivers that they can resume operations */
+ pci_walk_bus(frozen_bus, eeh_report_resume, 0);
+}
+
+/* ---------- end of file ---------- */
Index: linux-2.6.14-rc2-git6/arch/ppc64/kernel/eeh_event.c
===================================================================
--- linux-2.6.14-rc2-git6.orig/arch/ppc64/kernel/eeh_event.c 2005-10-06 17:50:24.089783186 -0500
+++ linux-2.6.14-rc2-git6/arch/ppc64/kernel/eeh_event.c 2005-10-06 17:56:14.461622651 -0500
@@ -21,6 +21,7 @@
#include <linux/list.h>
#include <linux/pci.h>
#include "eeh_event.h"
+#include "pci.h"

/** Overview:
* EEH error states may be detected within exception handlers;
@@ -36,30 +37,7 @@
static void eeh_thread_launcher(void *);
DECLARE_WORK(eeh_event_wq, eeh_thread_launcher, NULL);

-/**
- * eeh_panic - call panic() for an eeh event that cannot be handled.
- * The philosophy of this routine is that it is better to panic and
- * halt the OS than it is to risk possible data corruption by
- * oblivious device drivers that don't know better.
- *
- * @dev pci device that had an eeh event
- * @reset_state current reset state of the device slot
- */
-static void eeh_panic(struct pci_dev *dev, int reset_state)
-{
- /*
- * Since the panic_on_oops sysctl is used to halt the system
- * in light of potential corruption, we can use it here.
- */
- if (panic_on_oops) {
- panic("EEH: MMIO failure (%d) on device:%s\n", reset_state,
- pci_name(dev));
- }
- else {
- printk(KERN_INFO "EEH: Ignored MMIO failure (%d) on device:%s\n",
- reset_state, pci_name(dev));
- }
-}
+int handle_eeh_events (struct eeh_event *event);

/**
* eeh_event_handler - dispatch EEH events. The detection of a frozen
@@ -82,10 +60,16 @@

spin_lock_irqsave(&eeh_eventlist_lock, flags);
event = NULL;
+
+ /* Unqueue the event, get ready to process. */
if (!list_empty(&eeh_eventlist)) {
event = list_entry(eeh_eventlist.next, struct eeh_event, list);
list_del(&event->list);
}
+
+ if (event)
+ eeh_mark_slot(event->dn, EEH_MODE_RECOVERING);
+
spin_unlock_irqrestore(&eeh_eventlist_lock, flags);
if (event == NULL)
break;
@@ -93,8 +77,11 @@
printk(KERN_INFO "EEH: Detected PCI bus error on device %s\n",
pci_name(event->dev));

- eeh_panic (event->dev, event->state);
+ handle_eeh_events(event);
+
+ eeh_clear_slot(event->dn, EEH_MODE_RECOVERING);

+ pci_dev_put(event->dev);
kfree(event);
}

@@ -122,7 +109,7 @@
*/
int eeh_send_failure_event (struct device_node *dn,
struct pci_dev *dev,
- int state,
+ enum pci_channel_state state,
int time_unavail)
{
unsigned long flags;
Index: linux-2.6.14-rc2-git6/arch/ppc64/kernel/eeh_event.h
===================================================================
--- linux-2.6.14-rc2-git6.orig/arch/ppc64/kernel/eeh_event.h 2005-10-06 17:50:24.089783186 -0500
+++ linux-2.6.14-rc2-git6/arch/ppc64/kernel/eeh_event.h 2005-10-06 17:54:14.496454897 -0500
@@ -29,7 +29,7 @@
struct list_head list;
struct device_node *dn; /* struct device node */
struct pci_dev *dev; /* affected device */
- int state;
+ enum pci_channel_state state; /* PCI bus state for the affected device */
int time_unavail; /* milliseconds until device might be available */
};

@@ -46,7 +46,7 @@
*/
int eeh_send_failure_event (struct device_node *dn,
struct pci_dev *dev,
- int reset_state,
+ enum pci_channel_state state,
int time_unavail);

#endif /* ASM_PPC64_EEH_EVENT_H */
Index: linux-2.6.14-rc2-git6/arch/ppc64/kernel/Makefile
===================================================================
--- linux-2.6.14-rc2-git6.orig/arch/ppc64/kernel/Makefile 2005-10-06 17:54:00.307445749 -0500
+++ linux-2.6.14-rc2-git6/arch/ppc64/kernel/Makefile 2005-10-06 17:54:14.496454897 -0500
@@ -37,7 +37,7 @@
bpa_iic.o spider-pic.o

obj-$(CONFIG_KEXEC) += machine_kexec.o
-obj-$(CONFIG_EEH) += eeh.o eeh_event.o pci_dlpar.o
+obj-$(CONFIG_EEH) += eeh.o eeh_driver.o eeh_event.o pci_dlpar.o
obj-$(CONFIG_PROC_FS) += proc_ppc64.o
obj-$(CONFIG_RTAS_FLASH) += rtas_flash.o
obj-$(CONFIG_SMP) += smp.o
Index: linux-2.6.14-rc2-git6/arch/ppc64/kernel/pci.h
===================================================================
--- linux-2.6.14-rc2-git6.orig/arch/ppc64/kernel/pci.h 2005-10-06 17:53:02.165603605 -0500
+++ linux-2.6.14-rc2-git6/arch/ppc64/kernel/pci.h 2005-10-06 17:54:14.497454757 -0500
@@ -54,6 +54,15 @@
/* ---- EEH internal-use-only related routines ---- */
#ifdef CONFIG_EEH
/**
+ * eeh_slot_error_detail -- record and EEH error condition to the log
+ * @severity: 1 if temporary, 2 if permanent failure.
+ *
+ * Obtains the EEH error details from the RTAS subsystem,
+ * and then logs these details with the RTAS error log system.
+ */
+void eeh_slot_error_detail (struct pci_dn *pdn, int severity);
+
+/**
* rtas_set_slot_reset -- unfreeze a frozen slot
*
* Clear the EEH-frozen condition on a slot. This routine
Index: linux-2.6.14-rc2-git6/include/asm-ppc64/eeh.h
===================================================================
--- linux-2.6.14-rc2-git6.orig/include/asm-ppc64/eeh.h 2005-10-06 17:53:52.476544499 -0500
+++ linux-2.6.14-rc2-git6/include/asm-ppc64/eeh.h 2005-10-06 17:55:48.203306937 -0500
@@ -31,9 +31,11 @@
#ifdef CONFIG_EEH

/* Values for eeh_mode bits in device_node */
-#define EEH_MODE_SUPPORTED (1<<0)
-#define EEH_MODE_NOCHECK (1<<1)
-#define EEH_MODE_ISOLATED (1<<2)
+#define EEH_MODE_SUPPORTED (1<<0)
+#define EEH_MODE_NOCHECK (1<<1)
+#define EEH_MODE_ISOLATED (1<<2)
+#define EEH_MODE_RECOVERING (1<<3)
+#define EEH_MODE_IRQ_DISABLED (1<<4)

/* Max number of EEH freezes allowed before we consider the device
* to be permanently disabled. */

2005-10-06 23:53:23

by linas

[permalink] [raw]
Subject: [PATCH 16/22] PCI Address cache lookup code


16-pci-address-cache.patch

Architecture-independent PCI address caching code.
Performs caching and lookup of pci devices based on the
I/O addresses that they use. That is, given an I/O address,
this can be used to find the pci device that uses that address.
Although it currently lives in the ppc64 directory, it
could potentially be common code.
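
As a rough illustration of the lookup described above, here is a minimal user-space sketch. It is not the code from this patch: the patch keeps the ranges in a kernel red-black tree, while this sketch swaps in a sorted array with a binary search, which makes the same left/right decision at each step. The names struct addr_range, dev_id and lookup_dev_by_addr are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the patch's struct pci_io_addr_range. */
struct addr_range {
	unsigned long lo;  /* first address in the range (inclusive) */
	unsigned long hi;  /* last address in the range (inclusive) */
	int dev_id;        /* hypothetical handle; the patch stores a struct pci_dev * */
};

/*
 * Return the device that owns addr, or -1 if no range contains it.
 * ranges[] must be sorted by lo and must not overlap.
 */
int lookup_dev_by_addr(const struct addr_range *ranges, size_t n,
		       unsigned long addr)
{
	size_t left = 0, right = n;  /* search window is [left, right) */

	assert(ranges != NULL || n == 0);

	while (left < right) {
		size_t mid = left + (right - left) / 2;

		if (addr < ranges[mid].lo)
			right = mid;       /* corresponds to n = n->rb_left */
		else if (addr > ranges[mid].hi)
			left = mid + 1;    /* corresponds to n = n->rb_right */
		else
			return ranges[mid].dev_id;
	}
	return -1;
}
```

Unlike the real cache, this sketch takes no lock and holds no reference on the device; the patch does both so that the lookup is safe from interrupt context.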

This code used to live in the overly large EEH file.
This patch splits it out to its own file.

Signed-off-by: Linas Vepstas <[email protected]>

Index: linux-2.6.14-rc2-git6/arch/ppc64/kernel/Makefile
===================================================================
--- linux-2.6.14-rc2-git6.orig/arch/ppc64/kernel/Makefile 2005-10-06 17:54:14.496454897 -0500
+++ linux-2.6.14-rc2-git6/arch/ppc64/kernel/Makefile 2005-10-06 17:56:42.934627625 -0500
@@ -37,7 +37,7 @@
bpa_iic.o spider-pic.o

obj-$(CONFIG_KEXEC) += machine_kexec.o
-obj-$(CONFIG_EEH) += eeh.o eeh_driver.o eeh_event.o pci_dlpar.o
+obj-$(CONFIG_EEH) += eeh.o eeh_cache.o eeh_driver.o eeh_event.o pci_dlpar.o
obj-$(CONFIG_PROC_FS) += proc_ppc64.o
obj-$(CONFIG_RTAS_FLASH) += rtas_flash.o
obj-$(CONFIG_SMP) += smp.o
Index: linux-2.6.14-rc2-git6/arch/ppc64/kernel/eeh.c
===================================================================
--- linux-2.6.14-rc2-git6.orig/arch/ppc64/kernel/eeh.c 2005-10-06 17:54:14.494455177 -0500
+++ linux-2.6.14-rc2-git6/arch/ppc64/kernel/eeh.c 2005-10-06 17:56:42.936627345 -0500
@@ -78,9 +78,6 @@
*/
#define EEH_MAX_FAILS 100000

-/* Misc forward declaraions */
-static void eeh_save_bars(struct pci_dev * pdev, struct pci_dn *pdn);
-
/* RTAS tokens */
static int ibm_set_eeh_option;
static int ibm_set_slot_reset;
@@ -108,296 +105,8 @@
static DEFINE_PER_CPU(unsigned long, ignored_failures);
static DEFINE_PER_CPU(unsigned long, slot_resets);

-/**
- * The pci address cache subsystem. This subsystem places
- * PCI device address resources into a red-black tree, sorted
- * according to the address range, so that given only an i/o
- * address, the corresponding PCI device can be **quickly**
- * found. It is safe to perform an address lookup in an interrupt
- * context; this ability is an important feature.
- *
- * Currently, the only customer of this code is the EEH subsystem;
- * thus, this code has been somewhat tailored to suit EEH better.
- * In particular, the cache does *not* hold the addresses of devices
- * for which EEH is not enabled.
- *
- * (Implementation Note: The RB tree seems to be better/faster
- * than any hash algo I could think of for this problem, even
- * with the penalty of slow pointer chases for d-cache misses).
- */
-struct pci_io_addr_range
-{
- struct rb_node rb_node;
- unsigned long addr_lo;
- unsigned long addr_hi;
- struct pci_dev *pcidev;
- unsigned int flags;
-};
-
-static struct pci_io_addr_cache
-{
- struct rb_root rb_root;
- spinlock_t piar_lock;
-} pci_io_addr_cache_root;
-
-static inline struct pci_dev *__pci_get_device_by_addr(unsigned long addr)
-{
- struct rb_node *n = pci_io_addr_cache_root.rb_root.rb_node;
-
- while (n) {
- struct pci_io_addr_range *piar;
- piar = rb_entry(n, struct pci_io_addr_range, rb_node);
-
- if (addr < piar->addr_lo) {
- n = n->rb_left;
- } else {
- if (addr > piar->addr_hi) {
- n = n->rb_right;
- } else {
- pci_dev_get(piar->pcidev);
- return piar->pcidev;
- }
- }
- }
-
- return NULL;
-}
-
-/**
- * pci_get_device_by_addr - Get device, given only address
- * @addr: mmio (PIO) phys address or i/o port number
- *
- * Given an mmio phys address, or a port number, find a pci device
- * that implements this address. Be sure to pci_dev_put the device
- * when finished. I/O port numbers are assumed to be offset
- * from zero (that is, they do *not* have pci_io_addr added in).
- * It is safe to call this function within an interrupt.
- */
-static struct pci_dev *pci_get_device_by_addr(unsigned long addr)
-{
- struct pci_dev *dev;
- unsigned long flags;
-
- spin_lock_irqsave(&pci_io_addr_cache_root.piar_lock, flags);
- dev = __pci_get_device_by_addr(addr);
- spin_unlock_irqrestore(&pci_io_addr_cache_root.piar_lock, flags);
- return dev;
-}
-
-#ifdef DEBUG
-/*
- * Handy-dandy debug print routine, does nothing more
- * than print out the contents of our addr cache.
- */
-static void pci_addr_cache_print(struct pci_io_addr_cache *cache)
-{
- struct rb_node *n;
- int cnt = 0;
-
- n = rb_first(&cache->rb_root);
- while (n) {
- struct pci_io_addr_range *piar;
- piar = rb_entry(n, struct pci_io_addr_range, rb_node);
- printk(KERN_DEBUG "PCI: %s addr range %d [%lx-%lx]: %s\n",
- (piar->flags & IORESOURCE_IO) ? "i/o" : "mem", cnt,
- piar->addr_lo, piar->addr_hi, pci_name(piar->pcidev));
- cnt++;
- n = rb_next(n);
- }
-}
-#endif
-
-/* Insert address range into the rb tree. */
-static struct pci_io_addr_range *
-pci_addr_cache_insert(struct pci_dev *dev, unsigned long alo,
- unsigned long ahi, unsigned int flags)
-{
- struct rb_node **p = &pci_io_addr_cache_root.rb_root.rb_node;
- struct rb_node *parent = NULL;
- struct pci_io_addr_range *piar;
-
- /* Walk tree, find a place to insert into tree */
- while (*p) {
- parent = *p;
- piar = rb_entry(parent, struct pci_io_addr_range, rb_node);
- if (ahi < piar->addr_lo) {
- p = &parent->rb_left;
- } else if (alo > piar->addr_hi) {
- p = &parent->rb_right;
- } else {
- if (dev != piar->pcidev ||
- alo != piar->addr_lo || ahi != piar->addr_hi) {
- printk(KERN_WARNING "PIAR: overlapping address range\n");
- }
- return piar;
- }
- }
- piar = (struct pci_io_addr_range *)kmalloc(sizeof(struct pci_io_addr_range), GFP_ATOMIC);
- if (!piar)
- return NULL;
-
- piar->addr_lo = alo;
- piar->addr_hi = ahi;
- piar->pcidev = dev;
- piar->flags = flags;
-
-#ifdef DEBUG
- printk(KERN_DEBUG "PIAR: insert range=[%lx:%lx] dev=%s\n",
- alo, ahi, pci_name (dev));
-#endif
-
- rb_link_node(&piar->rb_node, parent, p);
- rb_insert_color(&piar->rb_node, &pci_io_addr_cache_root.rb_root);
-
- return piar;
-}
-
-static void __pci_addr_cache_insert_device(struct pci_dev *dev)
-{
- struct device_node *dn;
- struct pci_dn *pdn;
- int i;
- int inserted = 0;
-
- dn = pci_device_to_OF_node(dev);
- if (!dn) {
- printk(KERN_WARNING "PCI: no pci dn found for dev=%s\n", pci_name(dev));
- return;
- }
-
- /* Skip any devices for which EEH is not enabled. */
- pdn = PCI_DN(dn);
- if (!(pdn->eeh_mode & EEH_MODE_SUPPORTED) ||
- pdn->eeh_mode & EEH_MODE_NOCHECK) {
-#ifdef DEBUG
- printk(KERN_INFO "PCI: skip building address cache for=%s - %s\n",
- pci_name(dev), pdn->node->full_name);
-#endif
- return;
- }
-
- /* The cache holds a reference to the device... */
- pci_dev_get(dev);
-
- /* Walk resources on this device, poke them into the tree */
- for (i = 0; i < DEVICE_COUNT_RESOURCE; i++) {
- unsigned long start = pci_resource_start(dev,i);
- unsigned long end = pci_resource_end(dev,i);
- unsigned int flags = pci_resource_flags(dev,i);
-
- /* We are interested only bus addresses, not dma or other stuff */
- if (0 == (flags & (IORESOURCE_IO | IORESOURCE_MEM)))
- continue;
- if (start == 0 || ~start == 0 || end == 0 || ~end == 0)
- continue;
- pci_addr_cache_insert(dev, start, end, flags);
- inserted = 1;
- }
-
- /* If there was nothing to add, the cache has no reference... */
- if (!inserted)
- pci_dev_put(dev);
-}
-
-/**
- * pci_addr_cache_insert_device - Add a device to the address cache
- * @dev: PCI device whose I/O addresses we are interested in.
- *
- * In order to support the fast lookup of devices based on addresses,
- * we maintain a cache of devices that can be quickly searched.
- * This routine adds a device to that cache.
- */
-static void pci_addr_cache_insert_device(struct pci_dev *dev)
-{
- unsigned long flags;
-
- spin_lock_irqsave(&pci_io_addr_cache_root.piar_lock, flags);
- __pci_addr_cache_insert_device(dev);
- spin_unlock_irqrestore(&pci_io_addr_cache_root.piar_lock, flags);
-}
-
-static inline void __pci_addr_cache_remove_device(struct pci_dev *dev)
-{
- struct rb_node *n;
- int removed = 0;
-
-restart:
- n = rb_first(&pci_io_addr_cache_root.rb_root);
- while (n) {
- struct pci_io_addr_range *piar;
- piar = rb_entry(n, struct pci_io_addr_range, rb_node);
-
- if (piar->pcidev == dev) {
- rb_erase(n, &pci_io_addr_cache_root.rb_root);
- removed = 1;
- kfree(piar);
- goto restart;
- }
- n = rb_next(n);
- }
-
- /* The cache no longer holds its reference to this device... */
- if (removed)
- pci_dev_put(dev);
-}
-
-/**
- * pci_addr_cache_remove_device - remove pci device from addr cache
- * @dev: device to remove
- *
- * Remove a device from the addr-cache tree.
- * This is potentially expensive, since it will walk
- * the tree multiple times (once per resource).
- * But so what; device removal doesn't need to be that fast.
- */
-static void pci_addr_cache_remove_device(struct pci_dev *dev)
-{
- unsigned long flags;
-
- spin_lock_irqsave(&pci_io_addr_cache_root.piar_lock, flags);
- __pci_addr_cache_remove_device(dev);
- spin_unlock_irqrestore(&pci_io_addr_cache_root.piar_lock, flags);
-}
-
-/**
- * pci_addr_cache_build - Build a cache of I/O addresses
- *
- * Build a cache of pci i/o addresses. This cache will be used to
- * find the pci device that corresponds to a given address.
- * This routine scans all pci busses to build the cache.
- * Must be run late in boot process, after the pci controllers
- * have been scaned for devices (after all device resources are known).
- */
-void __init pci_addr_cache_build(void)
-{
- struct device_node *dn;
- struct pci_dev *dev = NULL;
-
- if (!eeh_subsystem_enabled)
- return;
-
- spin_lock_init(&pci_io_addr_cache_root.piar_lock);
-
- while ((dev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, dev)) != NULL) {
- /* Ignore PCI bridges ( XXX why ??) */
- if ((dev->class >> 16) == PCI_BASE_CLASS_BRIDGE) {
- continue;
- }
- pci_addr_cache_insert_device(dev);
-
- /* Save the BAR's; firmware doesn't restore these after EEH reset */
- dn = pci_device_to_OF_node(dev);
- eeh_save_bars(dev, PCI_DN(dn));
- }
-
-#ifdef DEBUG
- /* Verify tree built up above, echo back the list of addrs. */
- pci_addr_cache_print(&pci_io_addr_cache_root);
-#endif
-}
-
/* --------------------------------------------------------------- */
-/* Above lies the PCI Address Cache. Below lies the EEH event infrastructure */
+/* Below lies the EEH event infrastructure */

void eeh_slot_error_detail (struct pci_dn *pdn, int severity)
{
@@ -881,7 +590,7 @@
* PCI devices are added individuallly; but, for the restore,
* an entire slot is reset at a time.
*/
-static void eeh_save_bars(struct pci_dev * pdev, struct pci_dn *pdn)
+void eeh_save_bars(struct pci_dev * pdev, struct pci_dn *pdn)
{
int i;

Index: linux-2.6.14-rc2-git6/arch/ppc64/kernel/eeh_cache.c
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.14-rc2-git6/arch/ppc64/kernel/eeh_cache.c 2005-10-06 17:56:42.937627204 -0500
@@ -0,0 +1,316 @@
+/*
+ * eeh_cache.c
+ * PCI address cache; allows the lookup of PCI devices based on I/O address
+ *
+ * Copyright (C) 2004 Linas Vepstas <[email protected]> IBM Corporation
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+
+#include <linux/list.h>
+#include <linux/pci.h>
+#include <linux/rbtree.h>
+#include <linux/spinlock.h>
+#include <asm/atomic.h>
+#include <asm/systemcfg.h>
+#include "pci.h"
+
+#undef DEBUG
+
+/**
+ * The pci address cache subsystem. This subsystem places
+ * PCI device address resources into a red-black tree, sorted
+ * according to the address range, so that given only an i/o
+ * address, the corresponding PCI device can be **quickly**
+ * found. It is safe to perform an address lookup in an interrupt
+ * context; this ability is an important feature.
+ *
+ * Currently, the only customer of this code is the EEH subsystem;
+ * thus, this code has been somewhat tailored to suit EEH better.
+ * In particular, the cache does *not* hold the addresses of devices
+ * for which EEH is not enabled.
+ *
+ * (Implementation Note: The RB tree seems to be better/faster
+ * than any hash algo I could think of for this problem, even
+ * with the penalty of slow pointer chases for d-cache misses).
+ */
+struct pci_io_addr_range
+{
+ struct rb_node rb_node;
+ unsigned long addr_lo;
+ unsigned long addr_hi;
+ struct pci_dev *pcidev;
+ unsigned int flags;
+};
+
+static struct pci_io_addr_cache
+{
+ struct rb_root rb_root;
+ spinlock_t piar_lock;
+} pci_io_addr_cache_root;
+
+static inline struct pci_dev *__pci_get_device_by_addr(unsigned long addr)
+{
+ struct rb_node *n = pci_io_addr_cache_root.rb_root.rb_node;
+
+ while (n) {
+ struct pci_io_addr_range *piar;
+ piar = rb_entry(n, struct pci_io_addr_range, rb_node);
+
+ if (addr < piar->addr_lo) {
+ n = n->rb_left;
+ } else {
+ if (addr > piar->addr_hi) {
+ n = n->rb_right;
+ } else {
+ pci_dev_get(piar->pcidev);
+ return piar->pcidev;
+ }
+ }
+ }
+
+ return NULL;
+}
+
+/**
+ * pci_get_device_by_addr - Get device, given only address
+ * @addr: mmio (PIO) phys address or i/o port number
+ *
+ * Given an mmio phys address, or a port number, find a pci device
+ * that implements this address. Be sure to pci_dev_put the device
+ * when finished. I/O port numbers are assumed to be offset
+ * from zero (that is, they do *not* have pci_io_addr added in).
+ * It is safe to call this function within an interrupt.
+ */
+struct pci_dev *pci_get_device_by_addr(unsigned long addr)
+{
+ struct pci_dev *dev;
+ unsigned long flags;
+
+ spin_lock_irqsave(&pci_io_addr_cache_root.piar_lock, flags);
+ dev = __pci_get_device_by_addr(addr);
+ spin_unlock_irqrestore(&pci_io_addr_cache_root.piar_lock, flags);
+ return dev;
+}
+
+#ifdef DEBUG
+/*
+ * Handy-dandy debug print routine, does nothing more
+ * than print out the contents of our addr cache.
+ */
+static void pci_addr_cache_print(struct pci_io_addr_cache *cache)
+{
+ struct rb_node *n;
+ int cnt = 0;
+
+ n = rb_first(&cache->rb_root);
+ while (n) {
+ struct pci_io_addr_range *piar;
+ piar = rb_entry(n, struct pci_io_addr_range, rb_node);
+ printk(KERN_DEBUG "PCI: %s addr range %d [%lx-%lx]: %s\n",
+ (piar->flags & IORESOURCE_IO) ? "i/o" : "mem", cnt,
+ piar->addr_lo, piar->addr_hi, pci_name(piar->pcidev));
+ cnt++;
+ n = rb_next(n);
+ }
+}
+#endif
+
+/* Insert address range into the rb tree. */
+static struct pci_io_addr_range *
+pci_addr_cache_insert(struct pci_dev *dev, unsigned long alo,
+ unsigned long ahi, unsigned int flags)
+{
+ struct rb_node **p = &pci_io_addr_cache_root.rb_root.rb_node;
+ struct rb_node *parent = NULL;
+ struct pci_io_addr_range *piar;
+
+ /* Walk tree, find a place to insert into tree */
+ while (*p) {
+ parent = *p;
+ piar = rb_entry(parent, struct pci_io_addr_range, rb_node);
+ if (ahi < piar->addr_lo) {
+ p = &parent->rb_left;
+ } else if (alo > piar->addr_hi) {
+ p = &parent->rb_right;
+ } else {
+ if (dev != piar->pcidev ||
+ alo != piar->addr_lo || ahi != piar->addr_hi) {
+ printk(KERN_WARNING "PIAR: overlapping address range\n");
+ }
+ return piar;
+ }
+ }
+ piar = (struct pci_io_addr_range *)kmalloc(sizeof(struct pci_io_addr_range), GFP_ATOMIC);
+ if (!piar)
+ return NULL;
+
+ piar->addr_lo = alo;
+ piar->addr_hi = ahi;
+ piar->pcidev = dev;
+ piar->flags = flags;
+
+#ifdef DEBUG
+ printk(KERN_DEBUG "PIAR: insert range=[%lx:%lx] dev=%s\n",
+ alo, ahi, pci_name (dev));
+#endif
+
+ rb_link_node(&piar->rb_node, parent, p);
+ rb_insert_color(&piar->rb_node, &pci_io_addr_cache_root.rb_root);
+
+ return piar;
+}
+
+static void __pci_addr_cache_insert_device(struct pci_dev *dev)
+{
+ struct device_node *dn;
+ struct pci_dn *pdn;
+ int i;
+ int inserted = 0;
+
+ dn = pci_device_to_OF_node(dev);
+ if (!dn) {
+ printk(KERN_WARNING "PCI: no pci dn found for dev=%s\n", pci_name(dev));
+ return;
+ }
+
+ /* Skip any devices for which EEH is not enabled. */
+ pdn = PCI_DN(dn);
+ if (!(pdn->eeh_mode & EEH_MODE_SUPPORTED) ||
+ pdn->eeh_mode & EEH_MODE_NOCHECK) {
+#ifdef DEBUG
+ printk(KERN_INFO "PCI: skip building address cache for=%s - %s\n",
+ pci_name(dev), pdn->node->full_name);
+#endif
+ return;
+ }
+
+ /* The cache holds a reference to the device... */
+ pci_dev_get(dev);
+
+ /* Walk resources on this device, poke them into the tree */
+ for (i = 0; i < DEVICE_COUNT_RESOURCE; i++) {
+ unsigned long start = pci_resource_start(dev,i);
+ unsigned long end = pci_resource_end(dev,i);
+ unsigned int flags = pci_resource_flags(dev,i);
+
+ /* We are interested only in bus addresses, not dma or other stuff */
+ if (0 == (flags & (IORESOURCE_IO | IORESOURCE_MEM)))
+ continue;
+ if (start == 0 || ~start == 0 || end == 0 || ~end == 0)
+ continue;
+ pci_addr_cache_insert(dev, start, end, flags);
+ inserted = 1;
+ }
+
+ /* If there was nothing to add, the cache has no reference... */
+ if (!inserted)
+ pci_dev_put(dev);
+}
+
+/**
+ * pci_addr_cache_insert_device - Add a device to the address cache
+ * @dev: PCI device whose I/O addresses we are interested in.
+ *
+ * In order to support the fast lookup of devices based on addresses,
+ * we maintain a cache of devices that can be quickly searched.
+ * This routine adds a device to that cache.
+ */
+void pci_addr_cache_insert_device(struct pci_dev *dev)
+{
+ unsigned long flags;
+
+ spin_lock_irqsave(&pci_io_addr_cache_root.piar_lock, flags);
+ __pci_addr_cache_insert_device(dev);
+ spin_unlock_irqrestore(&pci_io_addr_cache_root.piar_lock, flags);
+}
+
+static inline void __pci_addr_cache_remove_device(struct pci_dev *dev)
+{
+ struct rb_node *n;
+ int removed = 0;
+
+restart:
+ n = rb_first(&pci_io_addr_cache_root.rb_root);
+ while (n) {
+ struct pci_io_addr_range *piar;
+ piar = rb_entry(n, struct pci_io_addr_range, rb_node);
+
+ if (piar->pcidev == dev) {
+ rb_erase(n, &pci_io_addr_cache_root.rb_root);
+ removed = 1;
+ kfree(piar);
+ goto restart;
+ }
+ n = rb_next(n);
+ }
+
+ /* The cache no longer holds its reference to this device... */
+ if (removed)
+ pci_dev_put(dev);
+}
+
+/**
+ * pci_addr_cache_remove_device - remove pci device from addr cache
+ * @dev: device to remove
+ *
+ * Remove a device from the addr-cache tree.
+ * This is potentially expensive, since it will walk
+ * the tree multiple times (once per resource).
+ * But so what; device removal doesn't need to be that fast.
+ */
+void pci_addr_cache_remove_device(struct pci_dev *dev)
+{
+ unsigned long flags;
+
+ spin_lock_irqsave(&pci_io_addr_cache_root.piar_lock, flags);
+ __pci_addr_cache_remove_device(dev);
+ spin_unlock_irqrestore(&pci_io_addr_cache_root.piar_lock, flags);
+}
+
+/**
+ * pci_addr_cache_build - Build a cache of I/O addresses
+ *
+ * Build a cache of pci i/o addresses. This cache will be used to
+ * find the pci device that corresponds to a given address.
+ * This routine scans all pci busses to build the cache.
+ * Must be run late in boot process, after the pci controllers
+ * have been scanned for devices (after all device resources are known).
+ */
+void __init pci_addr_cache_build(void)
+{
+ struct device_node *dn;
+ struct pci_dev *dev = NULL;
+
+ spin_lock_init(&pci_io_addr_cache_root.piar_lock);
+
+ while ((dev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, dev)) != NULL) {
+ /* Ignore PCI bridges */
+ if ((dev->class >> 16) == PCI_BASE_CLASS_BRIDGE)
+ continue;
+
+ pci_addr_cache_insert_device(dev);
+
+ /* Save the BAR's; firmware doesn't restore these after EEH reset */
+ dn = pci_device_to_OF_node(dev);
+ eeh_save_bars(dev, PCI_DN(dn));
+ }
+
+#ifdef DEBUG
+ /* Verify tree built up above, echo back the list of addrs. */
+ pci_addr_cache_print(&pci_io_addr_cache_root);
+#endif
+}
+
Index: linux-2.6.14-rc2-git6/arch/ppc64/kernel/pci.h
===================================================================
--- linux-2.6.14-rc2-git6.orig/arch/ppc64/kernel/pci.h 2005-10-06 17:54:14.497454757 -0500
+++ linux-2.6.14-rc2-git6/arch/ppc64/kernel/pci.h 2005-10-06 17:56:42.938627064 -0500
@@ -53,6 +53,14 @@

/* ---- EEH internal-use-only related routines ---- */
#ifdef CONFIG_EEH
+
+void pci_addr_cache_insert_device(struct pci_dev *dev);
+void pci_addr_cache_remove_device(struct pci_dev *dev);
+void pci_addr_cache_build(void);
+struct pci_dev *pci_get_device_by_addr(unsigned long addr);
+
+void eeh_save_bars(struct pci_dev * pdev, struct pci_dn *pdn);
+
/**
* eeh_slot_error_detail -- record and EEH error condition to the log
* @severity: 1 if temporary, 2 if permanent failure.

2005-10-06 23:54:39

by linas

[permalink] [raw]
Subject: [PATCH 17/22] ppc64: New Partition Endpoint support


17-eeh-partition-endpoint.patch

New versions of firmware introduce a new method by which the
"partition endpoint" (the point at which the pci bus is cut)
is identified. This code adds support for this (mandatory) new feature.

Signed-off-by: Linas Vepstas <[email protected]>


Index: linux-2.6.14-rc2-git6/arch/ppc64/kernel/eeh.c
===================================================================
--- linux-2.6.14-rc2-git6.orig/arch/ppc64/kernel/eeh.c 2005-10-06 17:56:42.936627345 -0500
+++ linux-2.6.14-rc2-git6/arch/ppc64/kernel/eeh.c 2005-10-06 17:56:46.221166493 -0500
@@ -84,6 +84,7 @@
static int ibm_read_slot_reset_state;
static int ibm_read_slot_reset_state2;
static int ibm_slot_error_detail;
+static int ibm_get_config_addr_info;

static int eeh_subsystem_enabled;

@@ -458,6 +459,7 @@
static void
rtas_pci_slot_reset(struct pci_dn *pdn, int state)
{
+ int config_addr;
int rc;

BUG_ON (pdn==NULL);
@@ -468,8 +470,13 @@
return;
}

+ /* Use PE configuration address, if present */
+ config_addr = pdn->eeh_config_addr;
+ if (pdn->eeh_pe_config_addr)
+ config_addr = pdn->eeh_pe_config_addr;
+
rc = rtas_call(ibm_set_slot_reset,4,1, NULL,
- pdn->eeh_config_addr,
+ config_addr,
BUID_HI(pdn->phb->buid),
BUID_LO(pdn->phb->buid),
state);
@@ -696,8 +703,22 @@
eeh_subsystem_enabled = 1;
pdn->eeh_mode |= EEH_MODE_SUPPORTED;
pdn->eeh_config_addr = regs[0];
+
+ /* If the newer, better, ibm,get-config-addr-info is supported,
+ * then use that instead. */
+ pdn->eeh_pe_config_addr = 0;
+ if (ibm_get_config_addr_info != RTAS_UNKNOWN_SERVICE) {
+ unsigned int rets[2];
+ ret = rtas_call (ibm_get_config_addr_info, 4, 2, rets,
+ pdn->eeh_config_addr,
+ info->buid_hi, info->buid_lo,
+ 0);
+ if (ret == 0)
+ pdn->eeh_pe_config_addr = rets[0];
+ }
#ifdef DEBUG
- printk(KERN_DEBUG "EEH: %s: eeh enabled\n", dn->full_name);
+ printk(KERN_DEBUG "EEH: %s: eeh enabled, config=%x pe_config=%x\n",
+ dn->full_name, pdn->eeh_config_addr, pdn->eeh_pe_config_addr);
#endif
} else {

@@ -749,6 +770,7 @@
ibm_read_slot_reset_state2 = rtas_token("ibm,read-slot-reset-state2");
ibm_read_slot_reset_state = rtas_token("ibm,read-slot-reset-state");
ibm_slot_error_detail = rtas_token("ibm,slot-error-detail");
+ ibm_get_config_addr_info = rtas_token("ibm,get-config-addr-info");

if (ibm_set_eeh_option == RTAS_UNKNOWN_SERVICE)
return;
Index: linux-2.6.14-rc2-git6/include/asm-ppc64/pci-bridge.h
===================================================================
--- linux-2.6.14-rc2-git6.orig/include/asm-ppc64/pci-bridge.h 2005-10-06 17:54:00.310445328 -0500
+++ linux-2.6.14-rc2-git6/include/asm-ppc64/pci-bridge.h 2005-10-06 17:56:46.222166353 -0500
@@ -61,6 +61,7 @@
int devfn; /* for pci devices */
int eeh_mode; /* See eeh.h for possible EEH_MODEs */
int eeh_config_addr;
+ int eeh_pe_config_addr; /* new-style partition endpoint address */
int eeh_check_count; /* # times driver ignored error */
int eeh_freeze_count; /* # times this device froze up. */
int eeh_is_bridge; /* device is pci-to-pci bridge */

2005-10-06 23:55:46

by linas

[permalink] [raw]
Subject: [PATCH 18/22] PCI Error Recovery: IPR SCSI device driver


PCI Error Recovery: IPR SCSI device driver

Various PCI bus errors can be signaled by newer PCI controllers. This
patch adds the PCI error recovery callbacks to the IPR SCSI device driver.
The patch has been tested, and appears to work well.
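
For readers new to the callback flow, the order in which a driver is walked through recovery (report the error, reset the slot if asked, then resume) can be modeled in user space roughly as below. Only the PCIERR_RESULT_* names come from this series; their numeric values, struct err_callbacks, the toy driver and recover() are hypothetical and are not the kernel interface.

```c
#include <assert.h>
#include <stddef.h>

/* Names mirror the series; the numeric values here are arbitrary. */
enum pcierr_result {
	PCIERR_RESULT_NONE,
	PCIERR_RESULT_CAN_RECOVER,
	PCIERR_RESULT_NEED_RESET,
	PCIERR_RESULT_RECOVERED
};

/* Hypothetical callback table; the real driver hooks may differ. */
struct err_callbacks {
	enum pcierr_result (*error_detected)(void *drv);
	enum pcierr_result (*slot_reset)(void *drv);
	void (*resume)(void *drv);
};

/* Toy driver state: counts how often each hook ran. */
struct toy_drv {
	int frozen, resets, resumes;
};

static enum pcierr_result toy_error(void *p)
{
	((struct toy_drv *)p)->frozen++;
	return PCIERR_RESULT_NEED_RESET;  /* ask for a slot reset */
}

static enum pcierr_result toy_reset(void *p)
{
	((struct toy_drv *)p)->resets++;
	return PCIERR_RESULT_RECOVERED;
}

static void toy_resume(void *p)
{
	((struct toy_drv *)p)->resumes++;
}

const struct err_callbacks toy_callbacks = { toy_error, toy_reset, toy_resume };

/* Walk one driver through: report error, reset if requested, resume. */
enum pcierr_result recover(const struct err_callbacks *cb, void *drv)
{
	enum pcierr_result res = PCIERR_RESULT_NONE;

	assert(cb != NULL);

	if (cb->error_detected)
		res = cb->error_detected(drv);

	/* As in this series, CAN_RECOVER is handled by a plain reset too. */
	if ((res == PCIERR_RESULT_NEED_RESET ||
	     res == PCIERR_RESULT_CAN_RECOVER) && cb->slot_reset)
		res = cb->slot_reset(drv);

	if (cb->resume)
		cb->resume(drv);
	return res;
}
```

The sketch leaves out the case where no driver is error-aware (PCIERR_RESULT_NONE), which the series handles by removing and re-adding the devices in the slot.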

Signed-off-by: Linas Vepstas <[email protected]>
Signed-off-by: Brian King <[email protected]>

--
arch/ppc64/configs/pSeries_defconfig | 1
drivers/scsi/Kconfig | 8 +++
drivers/scsi/ipr.c | 93 +++++++++++++++++++++++++++++++++++
3 files changed, 102 insertions(+)

Index: linux-2.6.14-rc2-git6/drivers/scsi/Kconfig
===================================================================
--- linux-2.6.14-rc2-git6.orig/drivers/scsi/Kconfig 2005-10-06 17:50:21.443154534 -0500
+++ linux-2.6.14-rc2-git6/drivers/scsi/Kconfig 2005-10-06 17:56:53.965079951 -0500
@@ -1087,6 +1087,14 @@
If you enable this support, the iprdump daemon can be used
to capture adapter failure analysis information.

+config SCSI_IPR_EEH_RECOVERY
+ bool "Enable PCI bus error recovery"
+ depends on SCSI_IPR && PPC_PSERIES
+ help
+ If you say Y here, the driver will be able to recover from
+ PCI bus errors on many PowerPC platforms. IBM pSeries users
+ should answer Y.
+
config SCSI_ZALON
tristate "Zalon SCSI support"
depends on GSC && SCSI
Index: linux-2.6.14-rc2-git6/drivers/scsi/ipr.c
===================================================================
--- linux-2.6.14-rc2-git6.orig/drivers/scsi/ipr.c 2005-10-06 17:50:21.444154394 -0500
+++ linux-2.6.14-rc2-git6/drivers/scsi/ipr.c 2005-10-06 17:56:53.972078969 -0500
@@ -5326,6 +5326,94 @@
shutdown_type);
}

+#ifdef CONFIG_SCSI_IPR_EEH_RECOVERY
+
+/** If the PCI slot is frozen, hold off all i/o
+ * activity; then, as soon as the slot is available again,
+ * initiate an adapter reset.
+ */
+static int ipr_reset_freeze(struct ipr_cmnd *ipr_cmd)
+{
+ /* Disallow new interrupts, avoid loop */
+ ipr_cmd->ioa_cfg->allow_interrupts = 0;
+ list_add_tail(&ipr_cmd->queue, &ipr_cmd->ioa_cfg->pending_q);
+ ipr_cmd->done = ipr_reset_ioa_job;
+ return IPR_RC_JOB_RETURN;
+}
+
+/** ipr_eeh_frozen -- called when slot has experienced a PCI bus error.
+ * This routine is called to tell us that the PCI bus is down.
+ * Can't do anything here, except put the device driver into a
+ * holding pattern, waiting for the PCI bus to come back.
+ */
+static void ipr_eeh_frozen (struct pci_dev *pdev)
+{
+ unsigned long flags = 0;
+ struct ipr_ioa_cfg *ioa_cfg = pci_get_drvdata(pdev);
+
+ spin_lock_irqsave(ioa_cfg->host->host_lock, flags);
+ _ipr_initiate_ioa_reset(ioa_cfg, ipr_reset_freeze, IPR_SHUTDOWN_NONE);
+ spin_unlock_irqrestore(ioa_cfg->host->host_lock, flags);
+}
+
+/** ipr_eeh_slot_reset - called when pci slot has been reset.
+ *
+ * This routine is called by the pci error recovery
+ * code after the PCI slot has been reset, just before we
+ * should resume normal operations.
+ */
+static int ipr_eeh_slot_reset(struct pci_dev *pdev)
+{
+ unsigned long flags = 0;
+ struct ipr_ioa_cfg *ioa_cfg = pci_get_drvdata(pdev);
+
+ // pci_enable_device(pdev);
+ // pci_set_master(pdev);
+ spin_lock_irqsave(ioa_cfg->host->host_lock, flags);
+ _ipr_initiate_ioa_reset(ioa_cfg, ipr_reset_restore_cfg_space,
+ IPR_SHUTDOWN_NONE);
+ spin_unlock_irqrestore(ioa_cfg->host->host_lock, flags);
+
+ return PCIERR_RESULT_RECOVERED;
+}
+
+/** This routine is called when the PCI bus has permanently
+ * failed. This routine should purge all pending I/O and
+ * shut down the device driver (close and unload).
+ */
+static void ipr_eeh_perm_failure(struct pci_dev *pdev)
+{
+ unsigned long flags = 0;
+ struct ipr_ioa_cfg *ioa_cfg = pci_get_drvdata(pdev);
+
+ spin_lock_irqsave(ioa_cfg->host->host_lock, flags);
+ if (ioa_cfg->sdt_state == WAIT_FOR_DUMP)
+ ioa_cfg->sdt_state = ABORT_DUMP;
+ ioa_cfg->reset_retries = IPR_NUM_RESET_RELOAD_RETRIES;
+ ioa_cfg->in_ioa_bringdown = 1;
+ ipr_initiate_ioa_reset(ioa_cfg, IPR_SHUTDOWN_NONE);
+ spin_unlock_irqrestore(ioa_cfg->host->host_lock, flags);
+}
+
+static int ipr_eeh_error_detected(struct pci_dev *pdev,
+ enum pci_channel_state state)
+{
+ switch (state) {
+ case pci_channel_io_frozen:
+ ipr_eeh_frozen (pdev);
+ return PCIERR_RESULT_NEED_RESET;
+
+ case pci_channel_io_perm_failure:
+ ipr_eeh_perm_failure (pdev);
+ return PCIERR_RESULT_DISCONNECT;
+ break;
+ default:
+ break;
+ }
+ return PCIERR_RESULT_NEED_RESET;
+}
+#endif
+
/**
* ipr_probe_ioa_part2 - Initializes IOAs found in ipr_probe_ioa(..)
* @ioa_cfg: ioa cfg struct
@@ -6063,12 +6151,23 @@
};
MODULE_DEVICE_TABLE(pci, ipr_pci_table);

+
+#ifdef CONFIG_SCSI_IPR_EEH_RECOVERY
+static struct pci_error_handlers ipr_err_handler = {
+ .error_detected = ipr_eeh_error_detected,
+ .slot_reset = ipr_eeh_slot_reset,
+};
+#endif /* CONFIG_SCSI_IPR_EEH_RECOVERY */
+
static struct pci_driver ipr_driver = {
.name = IPR_NAME,
.id_table = ipr_pci_table,
.probe = ipr_probe,
.remove = ipr_remove,
.shutdown = ipr_shutdown,
+#ifdef CONFIG_SCSI_IPR_EEH_RECOVERY
+ .err_handler = &ipr_err_handler,
+#endif /* CONFIG_SCSI_IPR_EEH_RECOVERY */
};

/**
Index: linux-2.6.14-rc2-git6/arch/ppc64/configs/pSeries_defconfig
===================================================================
--- linux-2.6.14-rc2-git6.orig/arch/ppc64/configs/pSeries_defconfig 2005-10-06 17:50:21.444154394 -0500
+++ linux-2.6.14-rc2-git6/arch/ppc64/configs/pSeries_defconfig 2005-10-06 17:56:53.974078688 -0500
@@ -476,6 +476,7 @@
CONFIG_SCSI_IPR=y
CONFIG_SCSI_IPR_TRACE=y
CONFIG_SCSI_IPR_DUMP=y
+CONFIG_SCSI_IPR_EEH_RECOVERY=y
# CONFIG_SCSI_QLOGIC_FC is not set
# CONFIG_SCSI_QLOGIC_1280 is not set
CONFIG_SCSI_QLA2XXX=y

2005-10-06 23:56:40

by linas

Subject: [PATCH 19/22] PCI Error Recovery: Symbios SCSI device driver


PCI Error Recovery: Symbios SCSI device driver

Various PCI bus errors can be signaled by newer PCI controllers. This
patch adds the PCI error recovery callbacks to the Symbios SCSI device driver.
The patch has been tested, and appears to work well.

Signed-off-by: Linas Vepstas <[email protected]>

--
arch/ppc64/configs/pSeries_defconfig | 1
drivers/scsi/Kconfig | 8 ++
drivers/scsi/sym53c8xx_2/sym_glue.c | 124 +++++++++++++++++++++++++++++++++++
drivers/scsi/sym53c8xx_2/sym_glue.h | 4 +
drivers/scsi/sym53c8xx_2/sym_hipd.c | 16 ++++
5 files changed, 153 insertions(+)

Index: linux-2.6.14-rc2-git6/arch/ppc64/configs/pSeries_defconfig
===================================================================
--- linux-2.6.14-rc2-git6.orig/arch/ppc64/configs/pSeries_defconfig 2005-10-06 10:36:42.939820924 -0500
+++ linux-2.6.14-rc2-git6/arch/ppc64/configs/pSeries_defconfig 2005-10-06 10:36:46.735288291 -0500
@@ -473,6 +473,7 @@
CONFIG_SCSI_SYM53C8XX_DEFAULT_TAGS=16
CONFIG_SCSI_SYM53C8XX_MAX_TAGS=64
# CONFIG_SCSI_SYM53C8XX_IOMAPPED is not set
+CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY=y
CONFIG_SCSI_IPR=y
CONFIG_SCSI_IPR_TRACE=y
CONFIG_SCSI_IPR_DUMP=y
Index: linux-2.6.14-rc2-git6/drivers/scsi/Kconfig
===================================================================
--- linux-2.6.14-rc2-git6.orig/drivers/scsi/Kconfig 2005-10-06 10:36:42.913824572 -0500
+++ linux-2.6.14-rc2-git6/drivers/scsi/Kconfig 2005-10-06 10:36:46.738287870 -0500
@@ -1062,6 +1062,14 @@
the card. This is significantly slower then using memory
mapped IO. Most people should answer N.

+config SCSI_SYM53C8XX_EEH_RECOVERY
+ bool "Enable PCI bus error recovery"
+ depends on SCSI_SYM53C8XX_2 && PPC_PSERIES
+ help
+ If you say Y here, the driver will be able to recover from
+ PCI bus errors on many PowerPC platforms. IBM pSeries users
+ should answer Y.
+
config SCSI_IPR
tristate "IBM Power Linux RAID adapter support"
depends on PCI && SCSI
Index: linux-2.6.14-rc2-git6/drivers/scsi/sym53c8xx_2/sym_glue.c
===================================================================
--- linux-2.6.14-rc2-git6.orig/drivers/scsi/sym53c8xx_2/sym_glue.c 2005-10-06 10:32:48.850671732 -0500
+++ linux-2.6.14-rc2-git6/drivers/scsi/sym53c8xx_2/sym_glue.c 2005-10-06 10:36:46.741287449 -0500
@@ -685,6 +685,10 @@
struct sym_hcb *np = (struct sym_hcb *)dev_id;

if (DEBUG_FLAGS & DEBUG_TINY) printf_debug ("[");
+#ifdef CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY
+ if (np->s.io_state != pci_channel_io_normal)
+ return IRQ_HANDLED;
+#endif /* CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY */

spin_lock_irqsave(np->s.host->host_lock, flags);
sym_interrupt(np);
@@ -759,6 +763,27 @@
*/
static void sym_eh_timeout(u_long p) { __sym_eh_done((struct scsi_cmnd *)p, 1); }

+#ifdef CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY
+static void sym_eeh_timeout(u_long p)
+{
+ struct sym_eh_wait *ep = (struct sym_eh_wait *) p;
+ if (!ep)
+ return;
+ complete(&ep->done);
+}
+
+static void sym_eeh_done(struct sym_eh_wait *ep)
+{
+ if (!ep)
+ return;
+ ep->timed_out = 0;
+ if (!del_timer(&ep->timer))
+ return;
+
+ complete(&ep->done);
+}
+#endif /* CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY */
+
/*
* Generic method for our eh processing.
* The 'op' argument tells what we have to do.
@@ -799,6 +824,37 @@

/* Try to proceed the operation we have been asked for */
sts = -1;
+#ifdef CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY
+
+ /* We may be in an error condition because the PCI bus
+ * went down. In this case, we need to wait until the
+ * PCI bus is reset, the card is reset, and only then
+ * proceed with the scsi error recovery. We'll wait
+ * for 15 seconds for this to happen.
+ */
+#define WAIT_FOR_PCI_RECOVERY 15
+ if (np->s.io_state != pci_channel_io_normal) {
+ struct sym_eh_wait eeh, *eep = &eeh;
+ np->s.io_reset_wait = eep;
+ init_completion(&eep->done);
+ init_timer(&eep->timer);
+ eep->to_do = SYM_EH_DO_WAIT;
+ eep->timer.expires = jiffies + (WAIT_FOR_PCI_RECOVERY*HZ);
+ eep->timer.function = sym_eeh_timeout;
+ eep->timer.data = (u_long)eep;
+ eep->timed_out = 1; /* Be pessimistic for once :) */
+ add_timer(&eep->timer);
+ spin_unlock_irq(np->s.host->host_lock);
+ wait_for_completion(&eep->done);
+ spin_lock_irq(np->s.host->host_lock);
+ if (eep->timed_out) {
+ printk (KERN_ERR "%s: Timed out waiting for PCI reset\n",
+ sym_name(np));
+ }
+ np->s.io_reset_wait = NULL;
+ }
+#endif /* CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY */
+
switch(op) {
case SYM_EH_ABORT:
sts = sym_abort_scsiio(np, cmd, 1);
@@ -1584,6 +1640,10 @@
np->maxoffs = dev->chip.offset_max;
np->maxburst = dev->chip.burst_max;
np->myaddr = dev->host_id;
+#ifdef CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY
+ np->s.io_state = pci_channel_io_normal;
+ np->s.io_reset_wait = NULL;
+#endif /* CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY */

/*
* Edit its name.
@@ -1916,6 +1976,59 @@
return 1;
}

+#ifdef CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY
+/** sym2_io_error_detected() is called when PCI error is detected */
+static int sym2_io_error_detected (struct pci_dev *pdev, enum pci_channel_state state)
+{
+ struct sym_hcb *np = pci_get_drvdata(pdev);
+
+ np->s.io_state = state;
+ // XXX If slot is permanently frozen, then what?
+ // Should we scsi_remove_host() maybe ??
+
+ /* Request a slot reset. */
+ return PCIERR_RESULT_NEED_RESET;
+}
+
+/** sym2_io_slot_reset is called when the pci bus has been reset.
+ * Restart the card from scratch. */
+static int sym2_io_slot_reset (struct pci_dev *pdev)
+{
+ struct sym_hcb *np = pci_get_drvdata(pdev);
+
+ printk (KERN_INFO "%s: recovering from a PCI slot reset\n",
+ sym_name(np));
+
+ if (pci_enable_device(pdev))
+ printk (KERN_ERR "%s: device setup failed most egregiously\n",
+ sym_name(np));
+
+ pci_set_master(pdev);
+ enable_irq (pdev->irq);
+
+ /* Perform host reset only on one instance of the card */
+ if (0 == PCI_FUNC (pdev->devfn))
+ sym_reset_scsi_bus(np, 0);
+
+ return PCIERR_RESULT_RECOVERED;
+}
+
+/** sym2_io_resume is called when the error recovery driver
+ * tells us that it's OK to resume normal operation.
+ */
+static void sym2_io_resume (struct pci_dev *pdev)
+{
+ struct sym_hcb *np = pci_get_drvdata(pdev);
+
+ /* Perform device startup only once for this card. */
+ if (0 == PCI_FUNC (pdev->devfn))
+ sym_start_up (np, 1);
+
+ np->s.io_state = pci_channel_io_normal;
+ sym_eeh_done (np->s.io_reset_wait);
+}
+#endif /* CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY */
+
/*
* Driver host template.
*/
@@ -2169,11 +2282,22 @@

MODULE_DEVICE_TABLE(pci, sym2_id_table);

+#ifdef CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY
+static struct pci_error_handlers sym2_err_handler = {
+ .error_detected = sym2_io_error_detected,
+ .slot_reset = sym2_io_slot_reset,
+ .resume = sym2_io_resume,
+};
+#endif /* CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY */
+
static struct pci_driver sym2_driver = {
.name = NAME53C8XX,
.id_table = sym2_id_table,
.probe = sym2_probe,
.remove = __devexit_p(sym2_remove),
+#ifdef CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY
+ .err_handler = &sym2_err_handler,
+#endif /* CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY */
};

static int __init sym2_init(void)
Index: linux-2.6.14-rc2-git6/drivers/scsi/sym53c8xx_2/sym_glue.h
===================================================================
--- linux-2.6.14-rc2-git6.orig/drivers/scsi/sym53c8xx_2/sym_glue.h 2005-10-06 10:32:48.851671592 -0500
+++ linux-2.6.14-rc2-git6/drivers/scsi/sym53c8xx_2/sym_glue.h 2005-10-06 10:36:46.742287309 -0500
@@ -181,6 +181,10 @@
char chip_name[8];
struct pci_dev *device;

+ /* pci bus i/o state; waiter for clearing of i/o state */
+ enum pci_channel_state io_state;
+ struct sym_eh_wait *io_reset_wait;
+
struct Scsi_Host *host;

void __iomem * ioaddr; /* MMIO kernel io address */
Index: linux-2.6.14-rc2-git6/drivers/scsi/sym53c8xx_2/sym_hipd.c
===================================================================
--- linux-2.6.14-rc2-git6.orig/drivers/scsi/sym53c8xx_2/sym_hipd.c 2005-10-06 10:32:48.851671592 -0500
+++ linux-2.6.14-rc2-git6/drivers/scsi/sym53c8xx_2/sym_hipd.c 2005-10-06 10:36:46.749286327 -0500
@@ -2806,6 +2806,7 @@
u_char istat, istatc;
u_char dstat;
u_short sist;
+ u_int icnt;

/*
* interrupt on the fly ?
@@ -2847,6 +2848,7 @@
sist = 0;
dstat = 0;
istatc = istat;
+ icnt = 0;
do {
if (istatc & SIP)
sist |= INW(np, nc_sist);
@@ -2854,6 +2856,20 @@
dstat |= INB(np, nc_dstat);
istatc = INB(np, nc_istat);
istat |= istatc;
+#ifdef CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY
+ /* Prevent deadlock waiting on a condition that may never clear. */
+ /* XXX this is a temporary kludge; the correct way to detect
+ * a PCI bus error would be to use the io_check interfaces
+ * proposed by Hidetoshi Seto <[email protected]>
+ * Problem with polling like that is the state flag might not
+ * be set.
+ */
+ icnt ++;
+ if (100 < icnt) {
+ if (np->s.device->error_state != pci_channel_io_normal)
+ return;
+ }
+#endif /* CONFIG_SCSI_SYM53C8XX_EEH_RECOVERY */
} while (istatc & (SIP|DIP));

if (DEBUG_FLAGS & DEBUG_TINY)

2005-10-06 23:57:39

by linas

Subject: [PATCH 20/22] PCI Error Recovery: e100 network device driver


PCI Error Recovery: e100 network device driver

Various PCI bus errors can be signaled by newer PCI controllers. This
patch adds the PCI error recovery callbacks to the intel ethernet e100
device driver. The patch has been tested, and appears to work well.

Signed-off-by: Linas Vepstas <[email protected]>

--
arch/ppc64/configs/pSeries_defconfig | 1
drivers/net/Kconfig | 8 +++
drivers/net/e100.c | 73 +++++++++++++++++++++++++++++++++++
3 files changed, 82 insertions(+)

Index: linux-2.6.14-rc2-git6/arch/ppc64/configs/pSeries_defconfig
===================================================================
--- linux-2.6.14-rc2-git6.orig/arch/ppc64/configs/pSeries_defconfig 2005-09-27 16:15:29.957254295 -0500
+++ linux-2.6.14-rc2-git6/arch/ppc64/configs/pSeries_defconfig 2005-09-27 16:23:17.992430104 -0500
@@ -574,6 +574,7 @@
# CONFIG_DGRS is not set
# CONFIG_EEPRO100 is not set
CONFIG_E100=y
+CONFIG_E100_EEH_RECOVERY=y
# CONFIG_FEALNX is not set
# CONFIG_NATSEMI is not set
# CONFIG_NE2K_PCI is not set
Index: linux-2.6.14-rc2-git6/drivers/net/Kconfig
===================================================================
--- linux-2.6.14-rc2-git6.orig/drivers/net/Kconfig 2005-09-27 14:35:57.000000000 -0500
+++ linux-2.6.14-rc2-git6/drivers/net/Kconfig 2005-09-27 16:23:17.993429963 -0500
@@ -1394,6 +1394,14 @@
<file:Documentation/networking/net-modules.txt>. The module
will be called e100.

+config E100_EEH_RECOVERY
+ bool "Enable PCI bus error recovery"
+ depends on E100 && PPC_PSERIES
+ help
+ If you say Y here, the driver will be able to recover from
+ PCI bus errors on many PowerPC platforms. IBM pSeries users
+ should answer Y.
+
config LNE390
tristate "Mylex EISA LNE390A/B support (EXPERIMENTAL)"
depends on NET_PCI && EISA && EXPERIMENTAL
Index: linux-2.6.14-rc2-git6/drivers/net/e100.c
===================================================================
--- linux-2.6.14-rc2-git6.orig/drivers/net/e100.c 2005-09-27 14:35:57.825425161 -0500
+++ linux-2.6.14-rc2-git6/drivers/net/e100.c 2005-09-27 16:23:48.110194710 -0500
@@ -2650,6 +2650,76 @@
#endif
}

+#ifdef CONFIG_E100_EEH_RECOVERY
+
+/** e100_io_error_detected() is called when PCI error is detected */
+static int e100_io_error_detected(struct pci_dev *pdev, enum pci_channel_state state)
+{
+ struct net_device *netdev = pci_get_drvdata(pdev);
+
+ /* Same as calling e100_down(netdev_priv(netdev)), but generic */
+ netdev->stop(netdev);
+
+ /* Is a detach needed ?? */
+ // netif_device_detach(netdev);
+
+ /* Request a slot reset. */
+ return PCIERR_RESULT_NEED_RESET;
+}
+
+/** e100_io_slot_reset is called after the pci bus has been reset.
+ * Restart the card from scratch. */
+static int e100_io_slot_reset(struct pci_dev *pdev)
+{
+ struct net_device *netdev = pci_get_drvdata(pdev);
+ struct nic *nic = netdev_priv(netdev);
+
+ if(pci_enable_device(pdev)) {
+ printk(KERN_ERR "e100: Cannot re-enable PCI device after reset.\n");
+ return PCIERR_RESULT_DISCONNECT;
+ }
+ pci_set_master(pdev);
+
+ /* Only one device per card can do a reset */
+ if (0 != PCI_FUNC (pdev->devfn))
+ return PCIERR_RESULT_RECOVERED;
+
+ e100_hw_reset(nic);
+ e100_phy_init(nic);
+
+ if(e100_hw_init(nic)) {
+ DPRINTK(HW, ERR, "e100_hw_init failed\n");
+ return PCIERR_RESULT_DISCONNECT;
+ }
+
+ return PCIERR_RESULT_RECOVERED;
+}
+
+/** e100_io_resume is called when the error recovery driver
+ * tells us that it's OK to resume normal operation.
+ */
+static void e100_io_resume(struct pci_dev *pdev)
+{
+ struct net_device *netdev = pci_get_drvdata(pdev);
+ struct nic *nic = netdev_priv(netdev);
+
+ /* ack any pending wake events, disable PME */
+ pci_enable_wake(pdev, 0, 0);
+
+ netif_device_attach(netdev);
+ if(netif_running(netdev))
+ e100_open (netdev);
+
+ mod_timer(&nic->watchdog, jiffies);
+}
+
+static struct pci_error_handlers e100_err_handler = {
+ .error_detected = e100_io_error_detected,
+ .slot_reset = e100_io_slot_reset,
+ .resume = e100_io_resume,
+};
+
+#endif /* CONFIG_E100_EEH_RECOVERY */

static struct pci_driver e100_driver = {
.name = DRV_NAME,
@@ -2661,6 +2731,9 @@
.resume = e100_resume,
#endif
.shutdown = e100_shutdown,
+#ifdef CONFIG_E100_EEH_RECOVERY
+ .err_handler = &e100_err_handler,
+#endif /* CONFIG_E100_EEH_RECOVERY */
};

static int __init e100_init_module(void)

2005-10-06 23:58:21

by linas

Subject: [PATCH 21/22] PCI Error Recovery: e1000 network device driver


PCI Error Recovery: e1000 network device driver

Various PCI bus errors can be signaled by newer PCI controllers. This
patch adds the PCI error recovery callbacks to the intel gigabit
ethernet e1000 device driver. The patch has been tested, and appears
to work well.

Signed-off-by: Linas Vepstas <[email protected]>

--
arch/ppc64/configs/pSeries_defconfig | 1
drivers/net/Kconfig | 8 ++
drivers/net/e1000/e1000_main.c | 103 ++++++++++++++++++++++++++++++++++-
3 files changed, 111 insertions(+), 1 deletion(-)

Index: linux-2.6.14-rc2-git6/arch/ppc64/configs/pSeries_defconfig
===================================================================
--- linux-2.6.14-rc2-git6.orig/arch/ppc64/configs/pSeries_defconfig 2005-10-06 17:47:05.582635736 -0500
+++ linux-2.6.14-rc2-git6/arch/ppc64/configs/pSeries_defconfig 2005-10-06 17:47:12.737631817 -0500
@@ -593,6 +593,7 @@
# CONFIG_DL2K is not set
CONFIG_E1000=y
# CONFIG_E1000_NAPI is not set
+CONFIG_E1000_EEH_RECOVERY=y
# CONFIG_NS83820 is not set
# CONFIG_HAMACHI is not set
# CONFIG_YELLOWFIN is not set
Index: linux-2.6.14-rc2-git6/drivers/net/Kconfig
===================================================================
--- linux-2.6.14-rc2-git6.orig/drivers/net/Kconfig 2005-10-06 17:47:05.582635736 -0500
+++ linux-2.6.14-rc2-git6/drivers/net/Kconfig 2005-10-06 17:47:12.742631116 -0500
@@ -1856,6 +1856,14 @@

If in doubt, say N.

+config E1000_EEH_RECOVERY
+ bool "Enable PCI bus error recovery"
+ depends on E1000 && PPC_PSERIES
+ help
+ If you say Y here, the driver will be able to recover from
+ PCI bus errors on many PowerPC platforms. IBM pSeries users
+ should answer Y.
+
config MYRI_SBUS
tristate "MyriCOM Gigabit Ethernet support"
depends on SBUS
Index: linux-2.6.14-rc2-git6/drivers/net/e1000/e1000_main.c
===================================================================
--- linux-2.6.14-rc2-git6.orig/drivers/net/e1000/e1000_main.c 2005-10-06 17:47:05.582635736 -0500
+++ linux-2.6.14-rc2-git6/drivers/net/e1000/e1000_main.c 2005-10-06 17:47:36.880244362 -0500
@@ -172,6 +172,18 @@
static void e1000_netpoll (struct net_device *netdev);
#endif

+#ifdef CONFIG_E1000_EEH_RECOVERY
+static int e1000_io_error_detected(struct pci_dev *pdev, enum pci_channel_state state);
+static int e1000_io_slot_reset(struct pci_dev *pdev);
+static void e1000_io_resume(struct pci_dev *pdev);
+
+static struct pci_error_handlers e1000_err_handler = {
+ .error_detected = e1000_io_error_detected,
+ .slot_reset = e1000_io_slot_reset,
+ .resume = e1000_io_resume,
+};
+#endif /* CONFIG_E1000_EEH_RECOVERY */
+
/* Exported from other modules */

extern void e1000_check_options(struct e1000_adapter *adapter);
@@ -184,8 +196,11 @@
/* Power Managment Hooks */
#ifdef CONFIG_PM
.suspend = e1000_suspend,
- .resume = e1000_resume
+ .resume = e1000_resume,
#endif
+#ifdef CONFIG_E1000_EEH_RECOVERY
+ .err_handler = &e1000_err_handler,
+#endif /* CONFIG_E1000_EEH_RECOVERY */
};

MODULE_AUTHOR("Intel Corporation, <[email protected]>");
@@ -2446,6 +2461,12 @@

#define PHY_IDLE_ERROR_COUNT_MASK 0x00FF

+#ifdef CONFIG_E1000_EEH_RECOVERY
+ /* Prevent stats update while adapter is being reset */
+ if (adapter->link_speed == 0)
+ return;
+#endif /* CONFIG_E1000_EEH_RECOVERY */
+
spin_lock_irqsave(&adapter->stats_lock, flags);

/* these counters are modified from e1000_adjust_tbi_stats,
@@ -3791,4 +3812,90 @@
}
#endif

+#ifdef CONFIG_E1000_EEH_RECOVERY
+
+/** e1000_io_error_detected() is called when PCI error is detected */
+static int e1000_io_error_detected(struct pci_dev *pdev, enum pci_channel_state state)
+{
+ struct net_device *netdev = pci_get_drvdata(pdev);
+ struct e1000_adapter *adapter = netdev->priv;
+
+ if (netif_running(netdev))
+ e1000_down(adapter);
+
+ /* Request a slot reset. */
+ return PCIERR_RESULT_NEED_RESET;
+}
+
+/** e1000_io_slot_reset is called after the pci bus has been reset.
+ * Restart the card from scratch.
+ * Implementation resembles the first-half of the
+ * e1000_resume routine.
+ */
+static int e1000_io_slot_reset(struct pci_dev *pdev)
+{
+ struct net_device *netdev = pci_get_drvdata(pdev);
+ struct e1000_adapter *adapter = netdev->priv;
+
+ if (pci_enable_device(pdev)) {
+ printk(KERN_ERR "e1000: Cannot re-enable PCI device after reset.\n");
+ return PCIERR_RESULT_DISCONNECT;
+ }
+ pci_set_master(pdev);
+
+ pci_enable_wake(pdev, 3, 0);
+ pci_enable_wake(pdev, 4, 0); /* 4 == D3 cold */
+
+ /* Perform card reset only on one instance of the card */
+ if (0 != PCI_FUNC (pdev->devfn))
+ return PCIERR_RESULT_RECOVERED;
+
+ e1000_reset(adapter);
+ E1000_WRITE_REG(&adapter->hw, WUS, ~0);
+
+ return PCIERR_RESULT_RECOVERED;
+}
+
+/** e1000_io_resume is called when the error recovery driver
+ * tells us that it's OK to resume normal operation.
+ * Implementation resembles the second-half of the
+ * e1000_resume routine.
+ */
+static void e1000_io_resume(struct pci_dev *pdev)
+{
+ struct net_device *netdev = pci_get_drvdata(pdev);
+ struct e1000_adapter *adapter = netdev->priv;
+ uint32_t manc, swsm;
+
+ if (netif_running(netdev)) {
+ if (e1000_up(adapter)) {
+ printk("e1000: can't bring device back up after reset\n");
+ return;
+ }
+ }
+
+ netif_device_attach(netdev);
+
+ if (adapter->hw.mac_type >= e1000_82540 &&
+ adapter->hw.media_type == e1000_media_type_copper) {
+ manc = E1000_READ_REG(&adapter->hw, MANC);
+ manc &= ~(E1000_MANC_ARP_EN);
+ E1000_WRITE_REG(&adapter->hw, MANC, manc);
+ }
+
+ switch(adapter->hw.mac_type) {
+ case e1000_82573:
+ swsm = E1000_READ_REG(&adapter->hw, SWSM);
+ E1000_WRITE_REG(&adapter->hw, SWSM,
+ swsm | E1000_SWSM_DRV_LOAD);
+ break;
+ default:
+ break;
+ }
+
+ mod_timer(&adapter->watchdog_timer, jiffies);
+}
+
+#endif /* CONFIG_E1000_EEH_RECOVERY */
+
/* e1000_main.c */

2005-10-06 23:59:22

by linas

Subject: [PATCH 22/22] PCI Error Recovery: ixgb network device driver


PCI Error Recovery: ixgb network device driver

Various PCI bus errors can be signaled by newer PCI controllers. This
patch adds the PCI error recovery callbacks to the intel ten-gigabit
ethernet ixgb device driver. The patch has been tested, and appears
to work well.

Signed-off-by: Linas Vepstas <[email protected]>

--
arch/ppc64/configs/pSeries_defconfig | 1
drivers/net/Kconfig | 8 +++
drivers/net/ixgb/ixgb_main.c | 78 +++++++++++++++++++++++++++++++++++
3 files changed, 87 insertions(+)

Index: linux-2.6.14-rc2-git6/arch/ppc64/configs/pSeries_defconfig
===================================================================
--- linux-2.6.14-rc2-git6.orig/arch/ppc64/configs/pSeries_defconfig 2005-10-05 16:55:25.109651477 -0500
+++ linux-2.6.14-rc2-git6/arch/ppc64/configs/pSeries_defconfig 2005-10-05 16:55:26.410469062 -0500
@@ -610,6 +610,7 @@
#
CONFIG_IXGB=m
# CONFIG_IXGB_NAPI is not set
+CONFIG_IXGB_EEH_RECOVERY=y
CONFIG_S2IO=m
# CONFIG_S2IO_NAPI is not set
# CONFIG_2BUFF_MODE is not set
Index: linux-2.6.14-rc2-git6/drivers/net/Kconfig
===================================================================
--- linux-2.6.14-rc2-git6.orig/drivers/net/Kconfig 2005-10-05 16:55:25.114650776 -0500
+++ linux-2.6.14-rc2-git6/drivers/net/Kconfig 2005-10-05 16:55:26.414468501 -0500
@@ -2195,6 +2195,14 @@

If in doubt, say N.

+config IXGB_EEH_RECOVERY
+ bool "Enable PCI bus error recovery"
+ depends on IXGB && PPC_PSERIES
+ help
+ If you say Y here, the driver will be able to recover from
+ PCI bus errors on many PowerPC platforms. IBM pSeries users
+ should answer Y.
+
config S2IO
tristate "S2IO 10Gbe XFrame NIC"
depends on PCI
Index: linux-2.6.14-rc2-git6/drivers/net/ixgb/ixgb_main.c
===================================================================
--- linux-2.6.14-rc2-git6.orig/drivers/net/ixgb/ixgb_main.c 2005-10-05 16:54:33.590875982 -0500
+++ linux-2.6.14-rc2-git6/drivers/net/ixgb/ixgb_main.c 2005-10-05 17:00:08.092967727 -0500
@@ -132,6 +132,18 @@
static void ixgb_netpoll(struct net_device *dev);
#endif

+#ifdef CONFIG_IXGB_EEH_RECOVERY
+static int ixgb_io_error_detected (struct pci_dev *pdev, enum pci_channel_state state);
+static int ixgb_io_slot_reset (struct pci_dev *pdev);
+static void ixgb_io_resume (struct pci_dev *pdev);
+
+static struct pci_error_handlers ixgb_err_handler = {
+ .error_detected = ixgb_io_error_detected,
+ .slot_reset = ixgb_io_slot_reset,
+ .resume = ixgb_io_resume,
+};
+#endif /* CONFIG_IXGB_EEH_RECOVERY */
+
/* Exported from other modules */

extern void ixgb_check_options(struct ixgb_adapter *adapter);
@@ -141,6 +153,10 @@
.id_table = ixgb_pci_tbl,
.probe = ixgb_probe,
.remove = __devexit_p(ixgb_remove),
+#ifdef CONFIG_IXGB_EEH_RECOVERY
+ .err_handler = &ixgb_err_handler,
+#endif /* CONFIG_IXGB_EEH_RECOVERY */
+
};

MODULE_AUTHOR("Intel Corporation, <[email protected]>");
@@ -1653,8 +1669,16 @@
unsigned int i;
#endif

+#ifdef XXX_CONFIG_IXGB_EEH_RECOVERY
+ if(unlikely(icr==EEH_IO_ERROR_VALUE(4))) {
+ if (eeh_slot_is_isolated (adapter->pdev))
+ // disable_irq_nosync (adapter->pdev->irq);
+ return IRQ_NONE; /* Not our interrupt */
+ }
+#else
if(unlikely(!icr))
return IRQ_NONE; /* Not our interrupt */
+#endif /* CONFIG_IXGB_EEH_RECOVERY */

if(unlikely(icr & (IXGB_INT_RXSEQ | IXGB_INT_LSC))) {
mod_timer(&adapter->watchdog_timer, jiffies);
@@ -2124,4 +2148,71 @@
}
#endif

+#ifdef CONFIG_IXGB_EEH_RECOVERY
+
+/** ixgb_io_error_detected() is called when PCI error is detected */
+static int ixgb_io_error_detected (struct pci_dev *pdev, enum pci_channel_state state)
+{
+ struct net_device *netdev = pci_get_drvdata(pdev);
+ struct ixgb_adapter *adapter = netdev->priv;
+
+ if(netif_running(netdev))
+ ixgb_down(adapter, TRUE);
+
+ /* Request a slot reset. */
+ return PCIERR_RESULT_NEED_RESET;
+}
+
+/** ixgb_io_slot_reset is called after the pci bus has been reset.
+ * Restart the card from scratch.
+ * Implementation resembles the first-half of the
+ * ixgb_resume routine.
+ */
+static int ixgb_io_slot_reset (struct pci_dev *pdev)
+{
+ struct net_device *netdev = pci_get_drvdata(pdev);
+ struct ixgb_adapter *adapter = netdev->priv;
+
+ if(pci_enable_device(pdev)) {
+ printk(KERN_ERR "ixgb: Cannot re-enable PCI device after reset.\n");
+ return PCIERR_RESULT_DISCONNECT;
+ }
+ pci_set_master(pdev);
+
+ /* Perform card reset only on one instance of the card */
+ if (0 != PCI_FUNC (pdev->devfn))
+ return PCIERR_RESULT_RECOVERED;
+
+ ixgb_reset(adapter);
+
+ return PCIERR_RESULT_RECOVERED;
+}
+
+/** ixgb_io_resume is called when the error recovery driver
+ * tells us that it's OK to resume normal operation.
+ * Implementation resembles the second-half of the
+ * ixgb_resume routine.
+ */
+static void ixgb_io_resume (struct pci_dev *pdev)
+{
+ struct net_device *netdev = pci_get_drvdata(pdev);
+ struct ixgb_adapter *adapter = netdev->priv;
+
+ if(netif_running(netdev)) {
+ if(ixgb_up(adapter)) {
+ printk ("ixgb: can't bring device back up after reset\n");
+ return;
+ }
+ }
+
+ netif_device_attach(netdev);
+ mod_timer(&adapter->watchdog_timer, jiffies);
+
+ /* Reading all-ff's from the adapter will completely hose
+ * the counts and statistics. So just clear them out */
+ memset(&adapter->stats, 0, sizeof(struct ixgb_hw_stats));
+ ixgb_update_stats(adapter);
+}
+#endif /* CONFIG_IXGB_EEH_RECOVERY */
+
/* ixgb_main.c */

2005-10-11 00:11:39

by Greg KH

Subject: Re: [PATCH 20/22] PCI Error Recovery: e100 network device driver

On Thu, Oct 06, 2005 at 06:57:29PM -0500, linas wrote:
> +config E100_EEH_RECOVERY
> + bool "Enable PCI bus error recovery"
> + depends on E100 && PPC_PSERIES
> + help
> + If you say Y here, the driver will be able to recover from
> + PCI bus errors on many PowerPC platforms. IBM pSeries users
> + should answer Y.

Why make a config option for this at all? Who would turn it off?

> @@ -2661,6 +2731,9 @@
> .resume = e100_resume,
> #endif
> .shutdown = e100_shutdown,
> +#ifdef CONFIG_E100_EEH_RECOVERY
> + .err_handler = &e100_err_handler,
> +#endif /* CONFIG_E100_EEH_RECOVERY */

No, don't put #ifdefs in the middle of a structure, remember we made
err_handler always present in the .h file for a reason...
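
A minimal sketch of this point, assuming simplified mock types rather than the real `struct pci_driver`: because the `err_handler` member is always declared, the driver can assign it unconditionally, and a platform that never reports bus errors simply never calls it. All `mock_*` and `demo_*` names are hypothetical.

```c
#include <stddef.h>

struct mock_pci_dev;	/* opaque in this sketch */

struct mock_err_handlers {
	int (*error_detected)(struct mock_pci_dev *pdev);
};

struct mock_pci_driver {
	const char *name;
	/* Always declared, on every platform, so drivers need no #ifdef. */
	const struct mock_err_handlers *err_handler;
};

static int demo_error_detected(struct mock_pci_dev *pdev)
{
	(void)pdev;
	return 1;	/* stand-in for "request a slot reset" */
}

static const struct mock_err_handlers demo_err_handler = {
	.error_detected = demo_error_detected,
};

static const struct mock_pci_driver demo_driver = {
	.name		= "demo",
	.err_handler	= &demo_err_handler,	/* no #ifdef around this line */
};

/* Core side: a missing handler is tolerated; returns 0 if none was called. */
static int mock_report_error(const struct mock_pci_driver *drv,
			     struct mock_pci_dev *pdev)
{
	if (!drv->err_handler || !drv->err_handler->error_detected)
		return 0;
	return drv->err_handler->error_detected(pdev);
}
```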

thanks,

greg k-h

2005-10-11 23:04:14

by linas

Subject: Re: [PATCH 20/22] PCI Error Recovery: e100 network device driver

On Mon, Oct 10, 2005 at 05:10:56PM -0700, Greg KH was heard to remark:
> On Thu, Oct 06, 2005 at 06:57:29PM -0500, linas wrote:
> > +config E100_EEH_RECOVERY
> > + bool "Enable PCI bus error recovery"
> > + depends on E100 && PPC_PSERIES
> > + help
> > + If you say Y here, the driver will be able to recover from
> > + PCI bus errors on many PowerPC platforms. IBM pSeries users
> > + should answer Y.
>
> Why make a config option for this at all? Who would turn it off?

I wanted to have this turned off for anyone who didn't have
hardware capable of supporting this, and didn't really think
about how to hide this from the menu. I guess it's best to
just plain hide this, to keep the menus from getting cluttered.

> > @@ -2661,6 +2731,9 @@
> > .resume = e100_resume,
> > #endif
> > .shutdown = e100_shutdown,
> > +#ifdef CONFIG_E100_EEH_RECOVERY
> > + .err_handler = &e100_err_handler,
> > +#endif /* CONFIG_E100_EEH_RECOVERY */
>
> No, don't put #ifdefs in the middle of a structure, remember we made
> err_handler always present in the .h file for a reason...

OK.

I'll send revised patches tomorrow, hiding the config and
removing the ifdef.

--linas

2005-10-12 05:57:43

by Paul Mackerras

Subject: Re: [PATCH 20/22] PCI Error Recovery: e100 network device driver

linas writes:
> On Mon, Oct 10, 2005 at 05:10:56PM -0700, Greg KH was heard to remark:
> > On Thu, Oct 06, 2005 at 06:57:29PM -0500, linas wrote:
> > > +config E100_EEH_RECOVERY
> > > + bool "Enable PCI bus error recovery"
> > > + depends on E100 && PPC_PSERIES
> > > + help
> > > + If you say Y here, the driver will be able to recover from
> > > + PCI bus errors on many PowerPC platforms. IBM pSeries users
> > > + should answer Y.
> >
> > Why make a config option for this at all? Who would turn it off?
>
> I wanted to have this turned off for anyone who didn't have
> hardware capable of supporting this, and didn't really think
> about how to hide this from the menu. I guess it's best to
> just plain hide this, keep the menus from getting cluttered.

I would think we could have one config option to enable PCI bus error
recovery generally, and have the code in the drivers enabled by that.
I don't think we need an individual config option for each driver to
enable PCI error recovery.
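
One way this could look on the driver side is sketched below. It is only an illustration: `CONFIG_DEMO_PCI_ERR_RECOVERY` and the `DEMO_ERR_HANDLER()` macro are hypothetical names that do not come from the patches, and the structures are simplified mocks. The idea is that a single option decides, for every driver at once, whether the handler pointer is wired up.

```c
#include <stddef.h>

/* Hypothetical global option; in a real tree Kconfig would define it. */
#define CONFIG_DEMO_PCI_ERR_RECOVERY 1

#ifdef CONFIG_DEMO_PCI_ERR_RECOVERY
#define DEMO_ERR_HANDLER(h)	(&(h))
#else
#define DEMO_ERR_HANDLER(h)	NULL
#endif

struct mock_err_handlers {
	int (*error_detected)(void *pdev);
};

struct mock_pci_driver {
	const char *name;
	const struct mock_err_handlers *err_handler;
};

static int demo_error_detected(void *pdev)
{
	(void)pdev;
	return 1;	/* stand-in for "request a slot reset" */
}

/* Two unrelated drivers, both controlled by the one option above. */
static const struct mock_err_handlers net_a_err_handler = {
	.error_detected = demo_error_detected,
};
static const struct mock_err_handlers net_b_err_handler = {
	.error_detected = demo_error_detected,
};

static const struct mock_pci_driver net_a_driver = {
	.name		= "net_a",
	.err_handler	= DEMO_ERR_HANDLER(net_a_err_handler),
};
static const struct mock_pci_driver net_b_driver = {
	.name		= "net_b",
	.err_handler	= DEMO_ERR_HANDLER(net_b_err_handler),
};
```

With the option undefined, both pointers become NULL and the structure initializers themselves stay free of #ifdefs.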

Regards,
Paul.

2005-10-12 09:52:47

by Paul Mackerras

Subject: Re: [PATCH 15/22] ppc64: PCI Error Recovery: PPC64 core recovery routines

Linas writes:

> + /* We might not have a pci device, if it was a config space read
> + * that failed. Find the pci device now. */
> + if (!dev) {
> + while ((dev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, dev)) != NULL) {
> + if (pci_device_to_OF_node(dev) == event->dn)
> + break;
> + }
> + }

Couldn't we just use PCI_DN(event->dn)->pcidev here? Is there some
reason why this would not work in some circumstances? It would be
nice to avoid this linear search.
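
As a rough illustration of the difference between the two lookups, here is
a minimal sketch using hypothetical demo_* types rather than the real
struct device_node, struct pci_dn or struct pci_dev. It assumes each node
carries a back-pointer to its device and that the back-pointer is kept up
to date, which is exactly the condition being asked about.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for this sketch only. */
struct demo_node;

struct demo_dev {
	struct demo_node *node;	/* firmware node this device came from */
	struct demo_dev *next;	/* global device list */
};

struct demo_node {
	struct demo_dev *pcidev;	/* back-pointer, assumed to be maintained */
};

/* Linear search: walk every device until its node matches. */
static struct demo_dev *find_dev_by_scan(struct demo_dev *head,
					 const struct demo_node *node)
{
	for (struct demo_dev *d = head; d != NULL; d = d->next)
		if (d->node == node)
			return d;
	return NULL;
}

/* Direct lookup: follow the back-pointer stored in the node. */
static struct demo_dev *find_dev_direct(const struct demo_node *node)
{
	assert(node != NULL);
	return node->pcidev;
}
```

The direct form is constant time, but it is only correct if the
back-pointer is set and cleared whenever devices are added and removed.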

Paul.

2005-10-13 16:04:12

by linas

[permalink] [raw]
Subject: Re: [PATCH 15/22] ppc64: PCI Error Recovery: PPC64 core recovery routines

On Wed, Oct 12, 2005 at 07:49:52PM +1000, Paul Mackerras was heard to remark:
> Linas writes:
>
> > + /* We might not have a pci device, if it was a config space read
> > + * that failed. Find the pci device now. */
> > + if (!dev) {
> > + while ((dev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, dev)) != NULL) {
> > + if (pci_device_to_OF_node(dev) == event->dn)
> > + break;
> > + }
> > + }
>
> Couldn't we just use PCI_DN(event->dn)->pcidev here? Is there some
> reason why this would not work in some circumstances? It would be
> nice to avoid this linear search.

Funny that you mention this; I just chopped this out yesterday. It's cruft
left over from way back when.

The reason I chopped this out is due to a bug regarding the handling
of multi-function devices with the "new style" firmware interfaces.

With the new interfaces (i.e. those using ibm,get-config-addr-info),
every function on a pci card is labelled as a "partitionable endpoint".
By contrast, the current code assumes a "PE" is associated with a pci
card. As a result of this mismatch, the handling of multi-function
cards on systems with the new-style firmware is flubbed. (In particular,
the setup and use of config-space is muffed, resulting in crashes due
to access of i/o space that wasn't correctly set up).

I'm trying several different approaches to fixing this.

1) consolidating multiple pci functions (multiple PE's) into a single
"pci card" and treating the thing as a unit.

2) Treating each PE as completely distinct, and handling each distinctly.

Each approach seems to have problems. Fingers crossed, I hope to have
something working later today; however, I'm irritated that I even need
to solve this problem.
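
To make the difference in granularity concrete, here is a small sketch. It
is not the EEH code: the demo_* names are hypothetical, and it assumes only
the standard PCI devfn layout (slot in bits 7..3, function in bits 2..0).
How the firmware actually identifies a PE is not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Standard devfn layout: 5-bit slot, 3-bit function. */
#define DEMO_SLOT(devfn)	(((devfn) >> 3) & 0x1f)
#define DEMO_FUNC(devfn)	((devfn) & 0x07)

/* Hypothetical record for one PCI function. */
struct demo_fn {
	uint8_t bus;
	uint8_t devfn;
};

/* Approach 1: all functions in one slot form a single "card" unit. */
static bool same_card(struct demo_fn a, struct demo_fn b)
{
	return a.bus == b.bus && DEMO_SLOT(a.devfn) == DEMO_SLOT(b.devfn);
}

/* Approach 2: every function is its own unit. */
static bool same_function(struct demo_fn a, struct demo_fn b)
{
	return a.bus == b.bus && a.devfn == b.devfn;
}

/* Count how many distinct recovery units a grouping rule produces. */
static int count_units(const struct demo_fn *fns, int n,
		       bool (*same)(struct demo_fn, struct demo_fn))
{
	int units = 0;

	for (int i = 0; i < n; i++) {
		bool seen = false;

		for (int j = 0; j < i; j++) {
			if (same(fns[i], fns[j])) {
				seen = true;
				break;
			}
		}
		if (!seen)
			units++;
	}
	return units;
}
```

For a two-function card in slot 1 plus a single-function card in slot 2,
the first rule yields two units and the second yields three.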

--linas

2006-01-07 21:28:54

by Olaf Hering

[permalink] [raw]
Subject: Re: [PATCH 14/22] ppc64: RPA PHP to EEH code movement

On Thu, Oct 06, Linas Vepstas wrote:

>
> 14-rpaphp-migrate.patch
>
> This patch moves some pci device add & remove code from the PCI
> hotplug directory to the arch/ppc64/kernel directory, and cleans
> it up a tad. The primary reason for this is that the code performs
> some fairly generic operations that are shared with the PCI error
> recovery code (living in the arch/ppc64/kernel directory).

> +++ linux-2.6.14-rc2-git6/arch/ppc64/kernel/pci_dlpar.c 2005-10-06 17:54:00.306445890 -0500

> +pcibios_add_pci_devices(struct pci_bus * bus)

> + eeh_add_device_tree_early(dn);

eeh_add_device_tree_early is in eeh.c, which depends on CONFIG_EEH, but
pci_dlpar.c is compiled unconditionally. Current powerpc.git gives:

arch/powerpc/platforms/built-in.o(.text+0x99b8): In function `.pcibios_add_pci_devices':
: undefined reference to `.eeh_add_device_tree_early'
arch/powerpc/platforms/built-in.o(.text+0x9b40): In function `.pcibios_remove_pci_devices':
: undefined reference to `.eeh_remove_bus_device'
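
A common way to avoid this kind of link error is to give the header an
empty static inline stub for the disabled case, so callers that are always
compiled still resolve the symbol. A minimal sketch, with DEMO_EEH standing
in for CONFIG_EEH and all demo_* names being hypothetical (for brevity the
"enabled" implementation is placed in the same file, where a real tree
would keep it in a separate conditionally built source file):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type for this sketch. */
struct demo_node {
	int eeh_registered;
};

#ifdef DEMO_EEH
/* Feature enabled: the real work happens here. */
static void demo_eeh_add_tree_early(struct demo_node *dn)
{
	dn->eeh_registered = 1;
}
#else
/* Feature disabled: an empty stub keeps callers compiling and linking. */
static inline void demo_eeh_add_tree_early(struct demo_node *dn)
{
	(void)dn;
}
#endif

/* Unconditionally compiled caller; it needs no #ifdef of its own. */
static int demo_add_devices(struct demo_node *dn)
{
	assert(dn != NULL);
	demo_eeh_add_tree_early(dn);
	return dn->eeh_registered;
}
```

With DEMO_EEH undefined the call compiles away to nothing, and the caller
builds and links either way.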


--
short story of a lazy sysadmin:
alias appserv=wotan

2006-01-09 19:59:05

by linas

[permalink] [raw]
Subject: [PATCH]: powerpc: fix compile-time failure when EEH disabled.


Paul, please apply and fwd upstream.

--linas

Patch to fix compile problem reported by Olaf Hering:
Kernel fails to compile when CONFIG_EMBEDDED is enabled,
but CONFIG_EEH is disabled.

Signed-off-by: Linas Vepstas <[email protected]>

Index: linux-2.6.15-mm1/include/asm-powerpc/eeh.h
===================================================================
--- linux-2.6.15-mm1.orig/include/asm-powerpc/eeh.h 2006-01-09 12:23:39.698773976 -0600
+++ linux-2.6.15-mm1/include/asm-powerpc/eeh.h 2006-01-09 12:28:44.404818949 -0600
@@ -113,12 +113,11 @@
}

static inline void pci_addr_cache_build(void) { }
-
static inline void eeh_add_device_early(struct device_node *dn) { }
-
static inline void eeh_add_device_late(struct pci_dev *dev) { }
-
static inline void eeh_remove_device(struct pci_dev *dev) { }
+static inline void eeh_remove_bus_device(struct pci_dev *dev) { }
+static inline void eeh_add_device_tree_early(struct device_node *dn) { }

#define EEH_POSSIBLE_ERROR(val, type) (0)
#define EEH_IO_ERROR_VALUE(size) (-1UL)