2013-05-22 19:17:11

by Suravee Suthikulpanit

[permalink] [raw]
Subject: [PATCH 0/3] iommu/amd: IOMMU Error Reporting/Handling/Filtering

From: Suravee Suthikulpanit <[email protected]>

This patch set implements framework for handling errors reported via IOMMU
event log. It also implements mechanism to filter/suppress error messages when
IOMMU hardware generates large amount event logs, which is often caused by
devices performing invalid operations or from misconfiguring IOMMU hardware
(e.g. IO_PAGE_FAULT and INVALID_DEVICE_QEQUEST").

DEVICE vs IOMMU ERRORS:
=======================
Event types in AMD IOMMU event log can be categorized as:
- IOMMU error : An error which is specific to IOMMU hardware
- Device error: An error which is specific to a device
- Non-error : Miscelleneous events which are not classified as errors.
This patch set implements frameworks for handling "IOMMU error" and "device error".
For IOMMU error, the driver will log the event in dmesg and panic since the IOMMU
hardware is no longer functioning. For device error, the driver will decode and
log the error in dmesg based on the error logging level specified at boot time.

ERROR LOGGING LEVEL:
====================
The filtering framework introduce 3 levels of event logging,
"AMD_IOMMU_LOG_[DEFAULT|VERBOSE|DEBUG]". Users can specify the level
via a new boot option "amd_iommu_log=[default|verbose|debug]".
- default: Each error message is truncated. Filtering is enabled.
- verbose: Output detail error message. Filtering is enabled.
- debug : Output detail error message. Filtering is disabled.

ERROR THRESHOLD LEVEL:
======================
Error threshold is used by the log filtering logic to determine when to suppress
the errors from a particular device. The threshold is defined as "the number of errors
(X) over a specified period (Y sec)". When the threshold is reached, IOMMU driver will
suppress subsequent error messages from the device for a predefined period (Z sec).
X, Y, and Z is currently hard-coded to 10 errors, 5 sec, and 30 sec.

DATA STRUCTURE:
===============
A new structure "struct dte_err_info" is added. It contains error information
specific to each device table entry (DTE). The structure is allocated dynamically
per DTE when IOMMU driver handle device error for the first time.

ERROR STATES and LOG FILTERING:
============================================
The filtering framework define 3 device error states "NONE", "PROBATION" and "SUPPRESS".
1. From IOMMU driver intialization, all devices are in DEV_ERR_NONE state.
2. During interupt handling, IOMMU driver processes each entry in the event log.
3. If an entry is device error, the driver tags DTE with DEV_ERR_PROBATION and
report error via via dmesg.
4. For non-debug mode, if the device threshold is reached, the device is moved into
DEV_ERR_SUPPRESS state in which all error messages are suppressed.
5. After the suppress period has passed, the driver put the device in probation state,
and errors are reported once again. If the device continues to generate errors,
it will be re-suppress once the next threshold is reached.

EXAMPLE OUTPUT:
===============
AMD-Vi: Event=IO_PAGE_FAULT dev=3:0.0 dom=0x1b addr=0x97040 flg=N Ex Sup M P W Pm Ill Ta
AMD-Vi: Event=IO_PAGE_FAULT dev=3:0.0 dom=0x1b addr=0x97070 flg=N Ex Sup M P W Pm Ill Ta
AMD-Vi: Event=IO_PAGE_FAULT dev=3:0.0 dom=0x1b addr=0x97060 flg=N Ex Sup M P W Pm Ill Ta
AMD-Vi: Event=IO_PAGE_FAULT dev=3:0.0 dom=0x1b addr=0x4970 flg=N Ex Sup M P W Pm Ill Ta
AMD-Vi: Event=IO_PAGE_FAULT dev=3:0.0 dom=0x1b addr=0x98840 flg=N Ex Sup M P W Pm Ill Ta
AMD-Vi: Event=IO_PAGE_FAULT dev=3:0.0 dom=0x1b addr=0x98870 flg=N Ex Sup M P W Pm Ill Ta
AMD-Vi: Event=IO_PAGE_FAULT dev=3:0.0 dom=0x1b addr=0x98860 flg=N Ex Sup M P W Pm Ill Ta
AMD-Vi: Event=IO_PAGE_FAULT dev=3:0.0 dom=0x1b addr=0x4980 flg=N Ex Sup M P W Pm Ill Ta
AMD-Vi: Event=IO_PAGE_FAULT dev=3:0.0 dom=0x1b addr=0x99040 flg=N Ex Sup M P W Pm Ill Ta
AMD-Vi: Event=IO_PAGE_FAULT dev=3:0.0 dom=0x1b addr=0x99060 flg=N Ex Sup M P W Pm Ill Ta
AMD-Vi: Warning: IOMMU error threshold (10) reached for device=3:0.0. Suppress for 30 secs.!!!

Suravee Suthikulpanit (3):
iommu/amd: Adding amd_iommu_log cmdline option
iommu/amd: Add error handling/reporting/filtering logic
iommu/amd: Remove old event printing logic

Documentation/kernel-parameters.txt | 10 +
drivers/iommu/Makefile | 2 +-
drivers/iommu/amd_iommu.c | 85 +-------
drivers/iommu/amd_iommu_fault.c | 368 +++++++++++++++++++++++++++++++++++
drivers/iommu/amd_iommu_init.c | 19 ++
drivers/iommu/amd_iommu_proto.h | 6 +
drivers/iommu/amd_iommu_types.h | 16 ++
7 files changed, 426 insertions(+), 80 deletions(-)
create mode 100644 drivers/iommu/amd_iommu_fault.c

--
1.7.10.4


2013-05-22 19:16:42

by Suravee Suthikulpanit

[permalink] [raw]
Subject: [PATCH 1/3] iommu/amd: Adding amd_iommu_log cmdline option

From: Suravee Suthikulpanit <[email protected]>

Adding amd_iommu_log command line option to allow "default", "verbose" and "debug"
IOMMU error logging level in kernel log.

Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
Documentation/kernel-parameters.txt | 10 ++++++++++
drivers/iommu/amd_iommu_init.c | 17 +++++++++++++++++
drivers/iommu/amd_iommu_types.h | 6 ++++++
3 files changed, 33 insertions(+)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index c3bfacb..752f0f9 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -350,6 +350,16 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
driver will print ACPI tables for AMD IOMMU during
IOMMU initialization.

+ amd_iommu_log= [HW,X86-64]
+ Specify parametes to choose the logging level. By default
+ the AMD IOMMU minimizes the logging detail, filters
+ duplicate log entries and suppress logging when encounters
+ storm of interrupts from a particular device.
+ Available options are:
+ default - Default log level.
+ verbose - Output detail log messages.
+ debug - Output detail log messages and no filter/suppress.
+
amijoy.map= [HW,JOY] Amiga joystick support
Map of devices attached to JOY0DAT and JOY1DAT
Format: <a>,<b>
diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index bf51abb..66e3722 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -157,6 +157,8 @@ bool amd_iommu_v2_present __read_mostly;

bool amd_iommu_force_isolation __read_mostly;

+int amd_iommu_log_level __read_mostly = AMD_IOMMU_LOG_DEFAULT;
+
/*
* List of protection domains - used during resume
*/
@@ -2157,6 +2159,20 @@ static int __init parse_amd_iommu_options(char *str)
return 1;
}

+static int __init parse_amd_iommu_log(char *str)
+{
+ for (; *str; ++str) {
+ if (strncmp(str, "default", 7) == 0)
+ amd_iommu_log_level = AMD_IOMMU_LOG_DEFAULT;
+ else if (strncmp(str, "verbose", 7) == 0)
+ amd_iommu_log_level = AMD_IOMMU_LOG_VERBOSE;
+ else if (strncmp(str, "debug", 5) == 0)
+ amd_iommu_log_level = AMD_IOMMU_LOG_DEBUG;
+ }
+
+ return 1;
+}
+
static int __init parse_ivrs_ioapic(char *str)
{
unsigned int bus, dev, fn;
@@ -2219,6 +2235,7 @@ static int __init parse_ivrs_hpet(char *str)

__setup("amd_iommu_dump", parse_amd_iommu_dump);
__setup("amd_iommu=", parse_amd_iommu_options);
+__setup("amd_iommu_log=", parse_amd_iommu_log);
__setup("ivrs_ioapic", parse_ivrs_ioapic);
__setup("ivrs_hpet", parse_ivrs_hpet);

diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index 0285a21..85b7a65 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -694,6 +694,12 @@ extern bool amd_iommu_v2_present;

extern bool amd_iommu_force_isolation;

+#define AMD_IOMMU_LOG_DEFAULT 0
+#define AMD_IOMMU_LOG_VERBOSE 1
+#define AMD_IOMMU_LOG_DEBUG 2
+
+extern int amd_iommu_log_level;
+
/* Max levels of glxval supported */
extern int amd_iommu_max_glx_val;

--
1.7.10.4

2013-05-22 19:16:55

by Suravee Suthikulpanit

[permalink] [raw]
Subject: [PATCH 2/3] iommu/amd: Add error handling/reporting/filtering logic

From: Suravee Suthikulpanit <[email protected]>

Add error handling/reporting/filtering logic which uses the user-specified
amd_iommu_log option to determine the log reporting behavior.

Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
drivers/iommu/Makefile | 2 +-
drivers/iommu/amd_iommu_fault.c | 368 +++++++++++++++++++++++++++++++++++++++
drivers/iommu/amd_iommu_init.c | 2 +
drivers/iommu/amd_iommu_proto.h | 6 +
drivers/iommu/amd_iommu_types.h | 10 ++
5 files changed, 387 insertions(+), 1 deletion(-)
create mode 100644 drivers/iommu/amd_iommu_fault.c

diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index ef0e520..b18da7c 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -1,7 +1,7 @@
obj-$(CONFIG_IOMMU_API) += iommu.o
obj-$(CONFIG_OF_IOMMU) += of_iommu.o
obj-$(CONFIG_MSM_IOMMU) += msm_iommu.o msm_iommu_dev.o
-obj-$(CONFIG_AMD_IOMMU) += amd_iommu.o amd_iommu_init.o
+obj-$(CONFIG_AMD_IOMMU) += amd_iommu.o amd_iommu_init.o amd_iommu_fault.o
obj-$(CONFIG_AMD_IOMMU_V2) += amd_iommu_v2.o
obj-$(CONFIG_DMAR_TABLE) += dmar.o
obj-$(CONFIG_INTEL_IOMMU) += iova.o intel-iommu.o
diff --git a/drivers/iommu/amd_iommu_fault.c b/drivers/iommu/amd_iommu_fault.c
new file mode 100644
index 0000000..0bf380d
--- /dev/null
+++ b/drivers/iommu/amd_iommu_fault.c
@@ -0,0 +1,368 @@
+/*
+ * Copyright (C) 2013 Advanced Micro Devices, Inc.
+ * Author: Suravee Suthikulpanit <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+
+#include <linux/iommu.h>
+#include <linux/module.h>
+#include <linux/time.h>
+
+#include "amd_iommu_proto.h"
+#include "amd_iommu_types.h"
+
+#define AMD_IOMMU_ERR_THRESHOLD 10
+#define AMD_IOMMU_ERR_PERIOD_SEC 5
+#define AMD_IOMMU_SUP_PERIOD_SEC 30
+
+enum dev_error_state {
+ DEV_ERR_NONE,
+ DEV_ERR_PROBATION,
+ DEV_ERR_SUPPRESS,
+};
+
+struct dte_err_info {
+ struct list_head list;
+ struct amd_iommu *iommu;
+ u16 devid;
+ enum dev_error_state state;
+ u32 err_cnt;
+ unsigned long last_err_sec;
+ unsigned long suppress_sec;
+};
+
+struct _iommu_event_flags {
+ u32 gn:1, /* 16 */
+ nx:1, /* 17 */
+ us:1, /* 18 */
+ i:1, /* 19 */
+ pr:1, /* 20 */
+ rw:1, /* 21 */
+ pe:1, /* 22 */
+ rz:1, /* 23 */
+ tr:1, /* 24 */
+ type:3, /* [27:25] */
+ _reserved_:20; /* Reserved */
+};
+
+static const char * const _type_field_encodings[] = {
+ "Reserved", /* 00 */
+ "Master Abort", /* 01 */
+ "Target Abort", /* 10 */
+ "Data Error", /* 11 */
+};
+
+static const char * const _invalid_trnsac_desc[] = {
+ "Read request or non-posted write in the interrupt "
+ "addres range", /* 000 */
+ "Pretranslated transaction received from an "
+ "I/O device that has I=0 or V=0 in DTE", /* 001 */
+ "Port I/O space transaction received from an "
+ "I/O device that has IoCtl=00b in DTE", /* 010 */
+ "Posted write to invalid address range", /* 011 */
+ "Invalid read request or non-posted write", /* 100 */
+ "Posted write to the interrupt/EOI range from an "
+ "I/O device that has IntCtl=00b in DTE", /* 101 */
+ "Posted write to a reserved interrupt address range", /* 110 */
+ "Invalid transaction to the system management "
+ "address range", /* 111 */
+};
+
+static const char * const _invalid_trnslt_desc[] = {
+ "Translation request received from an I/O device "
+ "that has I=0, or has V=0, or has V=1 and "
+ "TV=0 in DTE", /* 000 */
+ "Translation request in invalid address range", /* 001 */
+ "Invalid translation request", /* 010 */
+ "Reserved", /* 011 */
+ "Reserved", /* 100 */
+ "Reserved", /* 101 */
+ "Reserved", /* 110 */
+ "Reserved", /* 111 */
+};
+
+static void dump_detail_error(struct _iommu_event_flags *p, int ev_type)
+{
+ u32 err_type = p->type;
+
+ pr_err("AMD-Vi: Error type details: (0x%x) ", err_type);
+ if ((ev_type == EVENT_TYPE_DEV_TAB_ERR) ||
+ (ev_type == EVENT_TYPE_PAGE_TAB_ERR) ||
+ (ev_type == EVENT_TYPE_CMD_HARD_ERR)) {
+ if (err_type < ARRAY_SIZE(_type_field_encodings)) {
+ pr_cont("%s\n",
+ _type_field_encodings[err_type]);
+ }
+ } else if (ev_type == EVENT_TYPE_INV_DEV_REQ) {
+ if (p->tr == 0) {
+ if (err_type < ARRAY_SIZE(_invalid_trnslt_desc))
+ pr_cont("%s\n",
+ _invalid_trnslt_desc[err_type]);
+ } else {
+ if (err_type < ARRAY_SIZE(_invalid_trnsac_desc))
+ pr_cont("%s\n",
+ _invalid_trnsac_desc[err_type]);
+ }
+ }
+}
+
+static void dump_flags(int flags, int ev_type)
+{
+ struct _iommu_event_flags *p = (struct _iommu_event_flags *) &flags;
+ u32 err_type = p->type;
+
+ pr_cont(" flg=%s %s %s %s %s %s %s %s %s",
+ (p->gn ? "G" : "N"),
+ (p->nx ? "Nx" : "Ex"),
+ (p->us ? "Usr" : "Sup"),
+ (p->i ? "I" : "M"),
+ (p->pr ? "P" : "NP"),
+ (p->rw ? "W" : "R"),
+ (p->pe ? "NPm" : "Pm"),
+ (p->rz ? "Rsv" : "Ill"),
+ (p->tr ? "Tl" : "Ta"));
+
+ /* Error type only needed for certain events */
+ if (amd_iommu_log_level < AMD_IOMMU_LOG_VERBOSE) {
+ if ((ev_type == EVENT_TYPE_DEV_TAB_ERR) ||
+ (ev_type == EVENT_TYPE_PAGE_TAB_ERR) ||
+ (ev_type == EVENT_TYPE_CMD_HARD_ERR) ||
+ (ev_type == EVENT_TYPE_INV_DEV_REQ))
+ pr_cont(" (0x%x)\n", err_type);
+ } else {
+ pr_cont("\n");
+ dump_detail_error(p, ev_type);
+ }
+}
+
+static void dump_dte_entry(u16 devid)
+{
+ if (amd_iommu_log_level < AMD_IOMMU_LOG_VERBOSE)
+ return;
+
+ pr_err("AMD-Vi: DTE[0:3]:%016llx %016llx %016llx %016llx\n",
+ amd_iommu_dev_table[devid].data[0],
+ amd_iommu_dev_table[devid].data[1],
+ amd_iommu_dev_table[devid].data[2],
+ amd_iommu_dev_table[devid].data[3]);
+}
+
+static void dump_command(unsigned long phys_addr)
+{
+ struct iommu_cmd *cmd = phys_to_virt(phys_addr);
+
+ if (amd_iommu_log_level < AMD_IOMMU_LOG_VERBOSE)
+ return;
+
+ pr_err("AMD-Vi: CMD[0:3]:%08x %08x %08x %08x\n",
+ cmd->data[0], cmd->data[1], cmd->data[2], cmd->data[3]);
+}
+
+void amd_iommu_print_event(int type, int devid, int domid,
+ int flags, u64 address)
+{
+ pr_err("AMD-Vi: Event=");
+
+ switch (type) {
+ case EVENT_TYPE_ILL_DEV:
+ pr_cont("ILLEGAL_DEV_TBL_ENTRY dev=%x:%x.%x addr=0x%llx",
+ PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
+ address);
+ dump_flags(flags, type);
+ dump_dte_entry(devid);
+ break;
+ case EVENT_TYPE_IO_FAULT:
+ pr_cont("IO_PAGE_FAULT dev=%x:%x.%x dom=0x%x addr=0x%llx",
+ PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
+ domid, address);
+ dump_flags(flags, type);
+ dump_dte_entry(devid);
+ break;
+ case EVENT_TYPE_DEV_TAB_ERR:
+ pr_cont("DEV_TAB_HW_ERR dev=%x:%x.%x addr=0x%llx",
+ PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
+ address);
+ dump_flags(flags, type);
+ break;
+ case EVENT_TYPE_PAGE_TAB_ERR:
+ pr_cont("PAGE_TAB_HW_ERR dev=%x:%x.%x dom=0x%4x addr=0x%llx",
+ PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
+ domid, address);
+ dump_flags(flags, type);
+ break;
+ case EVENT_TYPE_ILL_CMD:
+ pr_cont("ILLEGAL_CMD_ERR addr=0x%llx\n",
+ address);
+ dump_command(address);
+ break;
+ case EVENT_TYPE_CMD_HARD_ERR:
+ pr_cont("CMD_HW_ERR addr=0x%llx",
+ address);
+ dump_flags(flags, type);
+ break;
+ case EVENT_TYPE_IOTLB_INV_TO:
+ pr_cont("IOTLB_INV_TIMEOUT dev=%x:%x.%x addr=0x%llx\n",
+ PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
+ address);
+ break;
+ case EVENT_TYPE_INV_DEV_REQ:
+ pr_cont("INVALID_DEVICE_REQUEST dev=%x:%x.%x addr=0x%llx",
+ PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
+ address);
+ dump_flags(flags, type);
+ dump_dte_entry(devid);
+ break;
+ default:
+ pr_cont("UNKNOWN type=0x%x\n", type);
+ }
+}
+
+LIST_HEAD(amd_dte_err_list); /* list of all DTE in probation state */
+
+void amd_iommu_clear_all_dev_faults(void)
+{
+ struct dte_err_info *entry, *next;
+ list_for_each_entry_safe(entry, next, &amd_dte_err_list, list) {
+ list_del(&entry->list);
+ kfree(entry);
+ }
+}
+
+static struct dte_err_info *_get_dte_error_info(struct amd_iommu *iommu,
+ u16 devid)
+{
+ struct dte_err_info *entry;
+
+ list_for_each_entry(entry, &amd_dte_err_list, list) {
+ if (entry->devid != devid)
+ continue;
+ return entry;
+ }
+
+ entry = kzalloc(sizeof(*entry), GFP_KERNEL);
+ if (!entry)
+ return entry;
+
+ entry->iommu = iommu;
+ entry->devid = devid;
+ entry->state = DEV_ERR_NONE;
+ entry->err_cnt = 0;
+ entry->last_err_sec = get_seconds();
+ entry->suppress_sec = 0;
+ list_add(&entry->list, &amd_dte_err_list);
+
+ return entry;
+}
+
+static bool _check_suppress_device(struct dte_err_info *derr)
+{
+ bool ret = false;
+
+ /* Check if the error period is reached */
+ if ((get_seconds() - derr->last_err_sec) > AMD_IOMMU_ERR_PERIOD_SEC) {
+ /* Reset err,int counters */
+ derr->err_cnt = 0;
+ return ret;
+ }
+
+ /* We suppress device error log if the number of errors reaches
+ * the ERR_THRESHOLD within x sec.
+ */
+ if (derr->err_cnt >= AMD_IOMMU_ERR_THRESHOLD)
+ ret = true;
+
+ return ret;
+}
+
+static int _iommu_handle_dev_fault(struct amd_iommu *iommu, u64 address,
+ int type, int flags, int devid, int domid)
+{
+ int ret = 0;
+ struct dte_err_info *derr = _get_dte_error_info(iommu, devid);
+
+ derr->err_cnt++;
+
+ /* For debug mode, we don't do any filtering */
+ if (amd_iommu_log_level == AMD_IOMMU_LOG_DEBUG) {
+ amd_iommu_print_event(type, devid, domid, flags, address);
+ return ret;
+ }
+
+ if (derr->state == DEV_ERR_SUPPRESS) {
+ /* Check if the suppress period has passed */
+ if ((get_seconds() - derr->suppress_sec >=
+ AMD_IOMMU_SUP_PERIOD_SEC)) {
+ derr->state = DEV_ERR_PROBATION;
+ derr->suppress_sec = 0;
+ derr->err_cnt = 1;
+ derr->last_err_sec = get_seconds();
+ amd_iommu_print_event(type, devid, domid,
+ flags, address);
+ }
+ } else {
+ amd_iommu_print_event(type, devid, domid, flags, address);
+ if (_check_suppress_device(derr)) {
+ pr_err("AMD-Vi: Warning: IOMMU error threshold (%u) "
+ "reached for device=%x:%x.%x. Suppress for "
+ "%d secs.!!!\n",
+ derr->err_cnt,
+ PCI_BUS_NUM(derr->devid),
+ PCI_SLOT(derr->devid),
+ PCI_FUNC(derr->devid),
+ AMD_IOMMU_SUP_PERIOD_SEC);
+
+ derr->state = DEV_ERR_SUPPRESS;
+ derr->suppress_sec = get_seconds();
+ }
+ }
+
+ return ret;
+}
+
+int amd_iommu_handle_fault(struct amd_iommu *iommu,
+ u64 address, int type, int flags,
+ int devid, int domid)
+{
+ int ret = -EINVAL;
+
+ switch (type) {
+ /* Events which report device specific errors */
+ case EVENT_TYPE_IO_FAULT:
+ case EVENT_TYPE_ILL_DEV:
+ case EVENT_TYPE_INV_DEV_REQ:
+ case EVENT_TYPE_INV_PPR_REQ:
+ ret = _iommu_handle_dev_fault(iommu,
+ address, type, flags, devid, domid);
+ break;
+ /* Events which report commands errors */
+ case EVENT_TYPE_ILL_CMD:
+ panic("AMD-Vi: Illegal IOMMU command. This has caused"
+ "IOMMU to stop functioning.\n");
+ break;
+ /* Events which report hardware errors */
+ case EVENT_TYPE_DEV_TAB_ERR:
+ case EVENT_TYPE_PAGE_TAB_ERR:
+ case EVENT_TYPE_CMD_HARD_ERR:
+ {
+ amd_iommu_print_event(type, devid, domid, flags, address);
+ panic("AMD-Vi: IOMMU hardware error.\n");
+ break;
+ }
+ default:
+ break;
+ }
+
+ return ret;
+}
diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 66e3722..145d6ab 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -1682,6 +1682,8 @@ static void __init free_on_init_error(void)
gart_iommu_init();

#endif
+
+ amd_iommu_clear_all_dev_faults();
}

/* SB IOAPIC is always on this device in AMD systems */
diff --git a/drivers/iommu/amd_iommu_proto.h b/drivers/iommu/amd_iommu_proto.h
index c294961..eab830d 100644
--- a/drivers/iommu/amd_iommu_proto.h
+++ b/drivers/iommu/amd_iommu_proto.h
@@ -63,6 +63,12 @@ extern struct iommu_domain *amd_iommu_get_v2_domain(struct pci_dev *pdev);
extern int amd_iommu_complete_ppr(struct pci_dev *pdev, int pasid,
int status, int tag);

+extern int amd_iommu_handle_fault(struct amd_iommu *iommu,
+ u64 address, int type, int flags,
+ int devid, int domid);
+
+extern void amd_iommu_clear_all_dev_faults(void);
+
#ifndef CONFIG_AMD_IOMMU_STATS

static inline void amd_iommu_stats_init(void) { }
diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index 85b7a65..982411d 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -116,6 +116,7 @@
#define EVENT_TYPE_CMD_HARD_ERR 0x6
#define EVENT_TYPE_IOTLB_INV_TO 0x7
#define EVENT_TYPE_INV_DEV_REQ 0x8
+#define EVENT_TYPE_INV_PPR_REQ 0x9
#define EVENT_DEVID_MASK 0xffff
#define EVENT_DEVID_SHIFT 0
#define EVENT_DOMID_MASK 0xffff
@@ -172,6 +173,7 @@
/* macros and definitions for device table entries */
#define DEV_ENTRY_VALID 0x00
#define DEV_ENTRY_TRANSLATION 0x01
+#define DEV_ENTRY_GUEST_TRANSLATION 0x37
#define DEV_ENTRY_IR 0x3d
#define DEV_ENTRY_IW 0x3e
#define DEV_ENTRY_NO_PAGE_FAULT 0x62
@@ -179,6 +181,7 @@
#define DEV_ENTRY_SYSMGT1 0x68
#define DEV_ENTRY_SYSMGT2 0x69
#define DEV_ENTRY_IRQ_TBL_EN 0x80
+#define DEV_ENTRY_IG 0x85
#define DEV_ENTRY_INIT_PASS 0xb8
#define DEV_ENTRY_EINT_PASS 0xb9
#define DEV_ENTRY_NMI_PASS 0xba
@@ -626,6 +629,13 @@ struct dev_table_entry {
};

/*
+ * general struct to manage commands send to an IOMMU
+ */
+struct iommu_cmd {
+ u32 data[4];
+};
+
+/*
* One entry for unity mappings parsed out of the ACPI table.
*/
struct unity_map_entry {
--
1.7.10.4

2013-05-22 19:16:58

by Suravee Suthikulpanit

[permalink] [raw]
Subject: [PATCH 3/3] iommu/amd: Remove old event printing logic

From: Suravee Suthikulpanit <[email protected]>

Remove old event printing logic and hook up with the new event
handling logic in amd_iommu_fault.c

Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
drivers/iommu/amd_iommu.c | 85 ++++-----------------------------------------
1 file changed, 6 insertions(+), 79 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 21d02b0..44032d4 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -88,13 +88,6 @@ int amd_iommu_max_glx_val = -1;

static struct dma_map_ops amd_iommu_dma_ops;

-/*
- * general struct to manage commands send to an IOMMU
- */
-struct iommu_cmd {
- u32 data[4];
-};
-
struct kmem_cache *amd_iommu_irq_cache;

static void update_domain(struct protection_domain *domain);
@@ -596,28 +589,10 @@ static void amd_iommu_stats_init(void)
*
****************************************************************************/

-static void dump_dte_entry(u16 devid)
-{
- int i;
-
- for (i = 0; i < 4; ++i)
- pr_err("AMD-Vi: DTE[%d]: %016llx\n", i,
- amd_iommu_dev_table[devid].data[i]);
-}
-
-static void dump_command(unsigned long phys_addr)
-{
- struct iommu_cmd *cmd = phys_to_virt(phys_addr);
- int i;
-
- for (i = 0; i < 4; ++i)
- pr_err("AMD-Vi: CMD[%d]: %08x\n", i, cmd->data[i]);
-}
-
-static void iommu_print_event(struct amd_iommu *iommu, void *__evt)
+static void iommu_handle_event(struct amd_iommu *iommu, void *__evt)
{
int type, devid, domid, flags;
- volatile u32 *event = __evt;
+ u32 *event = __evt;
int count = 0;
u64 address;

@@ -638,57 +613,7 @@ retry:
goto retry;
}

- printk(KERN_ERR "AMD-Vi: Event logged [");
-
- switch (type) {
- case EVENT_TYPE_ILL_DEV:
- printk("ILLEGAL_DEV_TABLE_ENTRY device=%02x:%02x.%x "
- "address=0x%016llx flags=0x%04x]\n",
- PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
- address, flags);
- dump_dte_entry(devid);
- break;
- case EVENT_TYPE_IO_FAULT:
- printk("IO_PAGE_FAULT device=%02x:%02x.%x "
- "domain=0x%04x address=0x%016llx flags=0x%04x]\n",
- PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
- domid, address, flags);
- break;
- case EVENT_TYPE_DEV_TAB_ERR:
- printk("DEV_TAB_HARDWARE_ERROR device=%02x:%02x.%x "
- "address=0x%016llx flags=0x%04x]\n",
- PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
- address, flags);
- break;
- case EVENT_TYPE_PAGE_TAB_ERR:
- printk("PAGE_TAB_HARDWARE_ERROR device=%02x:%02x.%x "
- "domain=0x%04x address=0x%016llx flags=0x%04x]\n",
- PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
- domid, address, flags);
- break;
- case EVENT_TYPE_ILL_CMD:
- printk("ILLEGAL_COMMAND_ERROR address=0x%016llx]\n", address);
- dump_command(address);
- break;
- case EVENT_TYPE_CMD_HARD_ERR:
- printk("COMMAND_HARDWARE_ERROR address=0x%016llx "
- "flags=0x%04x]\n", address, flags);
- break;
- case EVENT_TYPE_IOTLB_INV_TO:
- printk("IOTLB_INV_TIMEOUT device=%02x:%02x.%x "
- "address=0x%016llx]\n",
- PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
- address);
- break;
- case EVENT_TYPE_INV_DEV_REQ:
- printk("INVALID_DEVICE_REQUEST device=%02x:%02x.%x "
- "address=0x%016llx flags=0x%04x]\n",
- PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
- address, flags);
- break;
- default:
- printk(KERN_ERR "UNKNOWN type=0x%02x]\n", type);
- }
+ amd_iommu_handle_fault(iommu, address, type, flags, devid, domid);

memset(__evt, 0, 4 * sizeof(u32));
}
@@ -701,7 +626,7 @@ static void iommu_poll_events(struct amd_iommu *iommu)
tail = readl(iommu->mmio_base + MMIO_EVT_TAIL_OFFSET);

while (head != tail) {
- iommu_print_event(iommu, iommu->evt_buf + head);
+ iommu_handle_event(iommu, iommu->evt_buf + head);
head = (head + EVENT_ENTRY_SIZE) % iommu->evt_buf_size;
}

@@ -814,6 +739,7 @@ irqreturn_t amd_iommu_int_thread(int irq, void *data)
*/
status = readl(iommu->mmio_base + MMIO_STATUS_OFFSET);
}
+
return IRQ_HANDLED;
}

@@ -3248,6 +3174,7 @@ static int __init alloc_passthrough_domain(void)

return 0;
}
+
static int amd_iommu_domain_init(struct iommu_domain *dom)
{
struct protection_domain *domain;
--
1.7.10.4

2013-06-04 05:27:39

by Suravee Suthikulpanit

[permalink] [raw]
Subject: Re: [PATCH 0/3] iommu/amd: IOMMU Error Reporting/Handling/Filtering

Ping

On 5/22/2013 2:15 PM, [email protected] wrote:
> From: Suravee Suthikulpanit <[email protected]>
>
> This patch set implements framework for handling errors reported via IOMMU
> event log. It also implements mechanism to filter/suppress error messages when
> IOMMU hardware generates large amount event logs, which is often caused by
> devices performing invalid operations or from misconfiguring IOMMU hardware
> (e.g. IO_PAGE_FAULT and INVALID_DEVICE_QEQUEST").
>
> DEVICE vs IOMMU ERRORS:
> =======================
> Event types in AMD IOMMU event log can be categorized as:
> - IOMMU error : An error which is specific to IOMMU hardware
> - Device error: An error which is specific to a device
> - Non-error : Miscelleneous events which are not classified as errors.
> This patch set implements frameworks for handling "IOMMU error" and "device error".
> For IOMMU error, the driver will log the event in dmesg and panic since the IOMMU
> hardware is no longer functioning. For device error, the driver will decode and
> log the error in dmesg based on the error logging level specified at boot time.
>
> ERROR LOGGING LEVEL:
> ====================
> The filtering framework introduce 3 levels of event logging,
> "AMD_IOMMU_LOG_[DEFAULT|VERBOSE|DEBUG]". Users can specify the level
> via a new boot option "amd_iommu_log=[default|verbose|debug]".
> - default: Each error message is truncated. Filtering is enabled.
> - verbose: Output detail error message. Filtering is enabled.
> - debug : Output detail error message. Filtering is disabled.
>
> ERROR THRESHOLD LEVEL:
> ======================
> Error threshold is used by the log filtering logic to determine when to suppress
> the errors from a particular device. The threshold is defined as "the number of errors
> (X) over a specified period (Y sec)". When the threshold is reached, IOMMU driver will
> suppress subsequent error messages from the device for a predefined period (Z sec).
> X, Y, and Z is currently hard-coded to 10 errors, 5 sec, and 30 sec.
>
> DATA STRUCTURE:
> ===============
> A new structure "struct dte_err_info" is added. It contains error information
> specific to each device table entry (DTE). The structure is allocated dynamically
> per DTE when IOMMU driver handle device error for the first time.
>
> ERROR STATES and LOG FILTERING:
> ============================================
> The filtering framework define 3 device error states "NONE", "PROBATION" and "SUPPRESS".
> 1. From IOMMU driver intialization, all devices are in DEV_ERR_NONE state.
> 2. During interupt handling, IOMMU driver processes each entry in the event log.
> 3. If an entry is device error, the driver tags DTE with DEV_ERR_PROBATION and
> report error via via dmesg.
> 4. For non-debug mode, if the device threshold is reached, the device is moved into
> DEV_ERR_SUPPRESS state in which all error messages are suppressed.
> 5. After the suppress period has passed, the driver put the device in probation state,
> and errors are reported once again. If the device continues to generate errors,
> it will be re-suppress once the next threshold is reached.
>
> EXAMPLE OUTPUT:
> ===============
> AMD-Vi: Event=IO_PAGE_FAULT dev=3:0.0 dom=0x1b addr=0x97040 flg=N Ex Sup M P W Pm Ill Ta
> AMD-Vi: Event=IO_PAGE_FAULT dev=3:0.0 dom=0x1b addr=0x97070 flg=N Ex Sup M P W Pm Ill Ta
> AMD-Vi: Event=IO_PAGE_FAULT dev=3:0.0 dom=0x1b addr=0x97060 flg=N Ex Sup M P W Pm Ill Ta
> AMD-Vi: Event=IO_PAGE_FAULT dev=3:0.0 dom=0x1b addr=0x4970 flg=N Ex Sup M P W Pm Ill Ta
> AMD-Vi: Event=IO_PAGE_FAULT dev=3:0.0 dom=0x1b addr=0x98840 flg=N Ex Sup M P W Pm Ill Ta
> AMD-Vi: Event=IO_PAGE_FAULT dev=3:0.0 dom=0x1b addr=0x98870 flg=N Ex Sup M P W Pm Ill Ta
> AMD-Vi: Event=IO_PAGE_FAULT dev=3:0.0 dom=0x1b addr=0x98860 flg=N Ex Sup M P W Pm Ill Ta
> AMD-Vi: Event=IO_PAGE_FAULT dev=3:0.0 dom=0x1b addr=0x4980 flg=N Ex Sup M P W Pm Ill Ta
> AMD-Vi: Event=IO_PAGE_FAULT dev=3:0.0 dom=0x1b addr=0x99040 flg=N Ex Sup M P W Pm Ill Ta
> AMD-Vi: Event=IO_PAGE_FAULT dev=3:0.0 dom=0x1b addr=0x99060 flg=N Ex Sup M P W Pm Ill Ta
> AMD-Vi: Warning: IOMMU error threshold (10) reached for device=3:0.0. Suppress for 30 secs.!!!
>
> Suravee Suthikulpanit (3):
> iommu/amd: Adding amd_iommu_log cmdline option
> iommu/amd: Add error handling/reporting/filtering logic
> iommu/amd: Remove old event printing logic
>
> Documentation/kernel-parameters.txt | 10 +
> drivers/iommu/Makefile | 2 +-
> drivers/iommu/amd_iommu.c | 85 +-------
> drivers/iommu/amd_iommu_fault.c | 368 +++++++++++++++++++++++++++++++++++
> drivers/iommu/amd_iommu_init.c | 19 ++
> drivers/iommu/amd_iommu_proto.h | 6 +
> drivers/iommu/amd_iommu_types.h | 16 ++
> 7 files changed, 426 insertions(+), 80 deletions(-)
> create mode 100644 drivers/iommu/amd_iommu_fault.c
>

2013-06-21 15:15:05

by Joerg Roedel

[permalink] [raw]
Subject: Re: [PATCH 0/3] iommu/amd: IOMMU Error Reporting/Handling/Filtering

On Wed, May 22, 2013 at 02:15:52PM -0500, Suthikulpanit, Suravee wrote:
> From: Suravee Suthikulpanit <[email protected]>
>
> This patch set implements framework for handling errors reported via IOMMU
> event log. It also implements mechanism to filter/suppress error messages when
> IOMMU hardware generates large amount event logs, which is often caused by
> devices performing invalid operations or from misconfiguring IOMMU hardware
> (e.g. IO_PAGE_FAULT and INVALID_DEVICE_QEQUEST").

Instead of extending this bad scaling dmesg error-reporting mechanism, I
would very much like to see an integration of IOMMU error handling into
the EDAC framework.

An exception is of course still the flags decoding in this patch-set.
This alone would be pretty compelling as long as there is no integration
into EDAC.


Joerg

2013-06-21 15:59:46

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 0/3] iommu/amd: IOMMU Error Reporting/Handling/Filtering

On Fri, Jun 21, 2013 at 05:15:00PM +0200, Joerg Roedel wrote:
> Instead of extending this bad scaling dmesg error-reporting mechanism,
> I would very much like to see an integration of IOMMU error handling
> into the EDAC framework.

Wouldn't just adding a tracepoint and collecting the logs in debugfs be
enough? This would at least avoid spamming syslog with the messages.

More sophisticated error handling can be done later...

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2013-06-21 16:24:59

by Joerg Roedel

[permalink] [raw]
Subject: Re: [PATCH 0/3] iommu/amd: IOMMU Error Reporting/Handling/Filtering

On Fri, Jun 21, 2013 at 05:59:33PM +0200, Borislav Petkov wrote:
> On Fri, Jun 21, 2013 at 05:15:00PM +0200, Joerg Roedel wrote:
> > Instead of extending this bad scaling dmesg error-reporting mechanism,
> > I would very much like to see an integration of IOMMU error handling
> > into the EDAC framework.
>
> Wouldn't just adding a tracepoint and collecting the logs in debugfs be
> enough? This would at least avoid spamming syslog with the messages.

I would rather not create another user-space interface for this which we
then have to keep forever because someone started using it. What we need
is a well defined mechanism to report these kernel metrics to userspace
which can then do with it whatever it wants. I think EDAC can provide
that interface.


Joerg

2013-06-21 16:44:09

by Shuah Khan

[permalink] [raw]
Subject: Re: [PATCH 0/3] iommu/amd: IOMMU Error Reporting/Handling/Filtering

On Fri, Jun 21, 2013 at 10:24 AM, Joerg Roedel <[email protected]> wrote:
> On Fri, Jun 21, 2013 at 05:59:33PM +0200, Borislav Petkov wrote:
>> On Fri, Jun 21, 2013 at 05:15:00PM +0200, Joerg Roedel wrote:
>> > Instead of extending this bad scaling dmesg error-reporting mechanism,
>> > I would very much like to see an integration of IOMMU error handling
>> > into the EDAC framework.
>>
>> Wouldn't just adding a tracepoint and collecting the logs in debugfs be
>> enough? This would at least avoid spamming syslog with the messages.
>
> I would rather not create another user-space interface for this which we
> then have to keep forever because someone started using it. What we need
> is a well defined mechanism to report these kernel metrics to userspace
> which can then do with it whatever it wants. I think EDAC can provide
> that interface.
>

I think it makes prefect sense to use EDAC as a common frame-work for
iommu errors as well.
Especially if and when APEI evolves to cover IOMMU tables (both Intel
VT-d and AMD) , since
EDAC is already capable of working with APEI, using EDAC will work
well on platforms that support
APEI as well as the ones that don't.

-- Shuah
Shuah Khan, Linux Kernel Developer - Open Source Group Samsung Research
America (Silicon Valley) [email protected] | (970) 672-0658

2013-06-21 17:36:38

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 0/3] iommu/amd: IOMMU Error Reporting/Handling/Filtering

On Fri, Jun 21, 2013 at 06:24:55PM +0200, Joerg Roedel wrote:
> I would rather not create another user-space interface for this which
> we then have to keep forever because someone started using it. What
> we need is a well defined mechanism to report these kernel metrics to
> userspace which can then do with it whatever it wants. I think EDAC
> can provide that interface.

What exactly do you think it should provide? A fast way to carry error
info to userspace (tracepoint) or some sort of error collection and
evaluation (need to do it yourself anyway - there's no such thing in
EDAC).

IOW, what specific functionality in EDAC are you actually thinking of?

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2013-06-24 07:26:52

by Joerg Roedel

[permalink] [raw]
Subject: Re: [PATCH 0/3] iommu/amd: IOMMU Error Reporting/Handling/Filtering

On Fri, Jun 21, 2013 at 07:36:25PM +0200, Borislav Petkov wrote:

> IOW, what specific functionality in EDAC are you actually thinking of?

Okay, my believe until yesterday was that EDAC provides a consistent way
to report hardware errors to userspace. That seems to be wrong. This in
mind I think having a trace-point for that makes sense.

But this trace-point must not be specific to a single IOMMU driver, I
want it generic enough so that it can be used by all common IOMMU
drivers to report hardware misbehaviors to userspace.


Thanks,

Joerg

2013-06-24 07:41:22

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 0/3] iommu/amd: IOMMU Error Reporting/Handling/Filtering

On Mon, Jun 24, 2013 at 09:26:48AM +0200, Joerg Roedel wrote:
> On Fri, Jun 21, 2013 at 07:36:25PM +0200, Borislav Petkov wrote:
>
> > IOW, what specific functionality in EDAC are you actually thinking of?
>
> Okay, my believe until yesterday was that EDAC provides a consistent way
> to report hardware errors to userspace. That seems to be wrong. This in
> mind I think having a trace-point for that makes sense.
>
> But this trace-point must not be specific to a single IOMMU driver, I
> want it generic enough so that it can be used by all common IOMMU
> drivers to report hardware misbehaviors to userspace.

Yep, either that or you define a TRACE_EVENT_CLASS which contains common
error info which all iommu drivers report, and those which want to
report more specific information pertaining only to them, then they need
to define a tracepoint based on the iommu event class.

Check the sources for examples.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--