2009-03-24 19:58:25

by Andy Lutomirski

Subject: 2.6.29: can't resume from suspend with DMAR (intel iommu) enabled

On vanilla 2.6.29 (on Ubuntu 8.10), on a Lenovo x200s, my system is
completely hosed on resume. It appears that even hard disk IO didn't
work (trying to do *anything* including getting a dmesg trace just
spewed sda io errors to the console). Hence no trace. I did an
alt-sysrq-b and the screen went blank and the machine just started
beeping at me.

Resume works much better with intel_iommu=off. (I remember seeing a
patch go by that purported to fix resume with IOMMU enabled, but it
didn't work for me.)

I'd be happy to try to make a better bug report if anyone has any bright ideas.

--Andy


Attachments:
config.gz (19.59 kB)

2009-03-24 20:03:28

by Rafael J. Wysocki

Subject: Re: 2.6.29: can't resume from suspend with DMAR (intel iommu) enabled

Added some CCs.

On Tuesday 24 March 2009, Andrew Lutomirski wrote:
> On vanilla 2.6.29 (on Ubuntu 8.10), on a Lenovo x200s, my system is
> completely hosed on resume. It appears that even hard disk IO didn't
> work (trying to do *anything* including getting a dmesg trace just
> spewed sda io errors to the console). Hence no trace. I did an
> alt-sysrq-b and the screen went blank and the machine just started
> beeping at me.
>
> Resume works much better with intel_iommu=off. (I remember seeing a
> patch go by that purported to fix resume with IOMMU enabled, but it
> didn't work for me.)
>
> I'd be happy to try to make a better bug report if anyone has any bright ideas.
>
> --Andy

2009-03-24 20:33:51

by Ingo Molnar

Subject: Re: 2.6.29: can't resume from suspend with DMAR (intel iommu) enabled


(Cc:s added)

* Andrew Lutomirski <[email protected]> wrote:

> On vanilla 2.6.29 (on Ubuntu 8.10), on a Lenovo x200s, my system
> is completely hosed on resume. It appears that even hard disk IO
> didn't work (trying to do *anything* including getting a dmesg
> trace just spewed sda io errors to the console). Hence no trace.
> I did an alt-sysrq-b and the screen went blank and the machine
> just started beeping at me.
>
> Resume works much better with intel_iommu=off. (I remember seeing
> a patch go by that purported to fix resume with IOMMU enabled, but
> it didn't work for me.)
>
> I'd be happy to try to make a better bug report if anyone has any
> bright ideas.

i have a Lenovo T500 that does not even boot with DMAR enabled
in the BIOS (it's default-off), i get this panic in early bootup:

DMAR hardware is malfunctioning

So i don't get to test suspend/resume ;-)

Ingo

2009-03-24 20:37:01

by Fenghua Yu

Subject: RE: 2.6.29: can't resume from suspend with DMAR (intel iommu) enabled

>
>(Cc:s added)
>
>* Andrew Lutomirski <[email protected]> wrote:
>
>> On vanilla 2.6.29 (on Ubuntu 8.10), on a Lenovo x200s, my system
>> is completely hosed on resume. It appears that even hard disk IO
>> didn't work (trying to do *anything* including getting a dmesg
>> trace just spewed sda io errors to the console). Hence no trace.
>> I did an alt-sysrq-b and the screen went blank and the machine
>> just started beeping at me.
>>
>> Resume works much better with intel_iommu=off. (I remember seeing
>> a patch go by that purported to fix resume with IOMMU enabled, but
>> it didn't work for me.)
>>
>> I'd be happy to try to make a better bug report if anyone has any
>> bright ideas.
>
>i have a Lenovo T500 that does not even boot with DMAR enabled
>in the BIOS (it's default-off), i get this panic in early bootup:
>
> DMAR hardware is malfunctioning
>
>So i don't get to test suspend/resume ;-)
>
Current kernel doesn't have iommu suspend/resume support yet. I'll send out suspend/resume patches today or tomorrow. Hope that will help.

Thanks.

-Fenghua

2009-03-24 20:49:07

by Kyle McMartin

Subject: Re: 2.6.29: can't resume from suspend with DMAR (intel iommu) enabled

On Tue, Mar 24, 2009 at 01:36:46PM -0700, Yu, Fenghua wrote:
> >
> >(Cc:s added)
> >
> >* Andrew Lutomirski <[email protected]> wrote:
> >
> >> On vanilla 2.6.29 (on Ubuntu 8.10), on a Lenovo x200s, my system
> >> is completely hosed on resume. It appears that even hard disk IO
> >> didn't work (trying to do *anything* including getting a dmesg
> >> trace just spewed sda io errors to the console). Hence no trace.
> >> I did an alt-sysrq-b and the screen went blank and the machine
> >> just started beeping at me.
> >>
> >> Resume works much better with intel_iommu=off. (I remember seeing
> >> a patch go by that purported to fix resume with IOMMU enabled, but
> >> it didn't work for me.)
> >>
> >> I'd be happy to try to make a better bug report if anyone has any
> >> bright ideas.
> >
> >i have a Lenovo T500 that does not even boot with DMAR enabled
> >in the BIOS (it's default-off), i get this panic in early bootup:
> >
> > DMAR hardware is malfunctioning
> >
> >So i don't get to test suspend/resume ;-)
> >
> Current kernel doesn't have iommu suspend/resume support yet. I'll send out suspend/resume patches today or tomorrow. Hope that will help.
>

Heh, awesome, someone could have brought this, uhm, subtle, weakness when
things were getting defaulted on...

regards, Kyle

2009-03-25 17:28:36

by mark gross

Subject: Re: 2.6.29: can't resume from suspend with DMAR (intel iommu) enabled

On Tue, Mar 24, 2009 at 09:32:59PM +0100, Ingo Molnar wrote:
>
> (Cc:s added)
>
> * Andrew Lutomirski <[email protected]> wrote:
>
> > On vanilla 2.6.29 (on Ubuntu 8.10), on a Lenovo x200s, my system
> > is completely hosed on resume. It appears that even hard disk IO
> > didn't work (trying to do *anything* including getting a dmesg
> > trace just spewed sda io errors to the console). Hence no trace.
> > I did an alt-sysrq-b and the screen went blank and the machine
> > just started beeping at me.
> >
> > Resume works much better with intel_iommu=off. (I remember seeing
> > a patch go by that purported to fix resume with IOMMU enabled, but
> > it didn't work for me.)
> >
> > I'd be happy to try to make a better bug report if anyone has any
> > bright ideas.
>
> i have a Lenovo T500 that does not even boot with DMAR enabled
> in the BIOS (it's default-off), i get this panic in early bootup:
>
> DMAR hardware is malfunctioning

That happens when the polling of the IOMMU registers fails to behave as
expected in the VT-d specification. Typically it's been a BIOS issue
when this happens.

>
> So i don't get to test suspend/resume ;-)
>
> Ingo
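
The register polling described above can be modelled in userspace. This is
only a sketch, not the kernel's IOMMU_WAIT_OP macro: struct fake_reg,
fake_read() and poll_status() are hypothetical names, the "register" is a
plain struct, and the timeout is a read count rather than a cycle count.
Unlike the behaviour reported in this thread, the model reports a timeout
to its caller instead of halting.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a memory-mapped status register. */
struct fake_reg {
	uint32_t value;             /* value the register currently holds */
	uint32_t ready_value;       /* value it takes once the "hardware" is done */
	unsigned reads_until_ready; /* reads that still return the old value */
};

/* Simulated register read: becomes ready after a fixed number of reads. */
static uint32_t fake_read(struct fake_reg *r)
{
	if (r->reads_until_ready > 0)
		r->reads_until_ready--;
	else
		r->value = r->ready_value;
	return r->value;
}

/*
 * Poll until (status & mask) is non-zero or max_polls reads have been made.
 * Returns true on success and false on timeout.
 */
static bool poll_status(struct fake_reg *r, uint32_t mask, unsigned max_polls)
{
	for (unsigned i = 0; i < max_polls; i++) {
		if (fake_read(r) & mask)
			return true;
	}
	return false;
}
```

A register that never sets the expected bit makes poll_status() return
false, which is the situation a caller would have to handle.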

2009-03-25 17:40:49

by Ingo Molnar

Subject: Re: 2.6.29: can't resume from suspend with DMAR (intel iommu) enabled


* mark gross <[email protected]> wrote:

> On Tue, Mar 24, 2009 at 09:32:59PM +0100, Ingo Molnar wrote:
> >
> > (Cc:s added)
> >
> > * Andrew Lutomirski <[email protected]> wrote:
> >
> > > On vanilla 2.6.29 (on Ubuntu 8.10), on a Lenovo x200s, my system
> > > is completely hosed on resume. It appears that even hard disk IO
> > > didn't work (trying to do *anything* including getting a dmesg
> > > trace just spewed sda io errors to the console). Hence no trace.
> > > I did an alt-sysrq-b and the screen went blank and the machine
> > > just started beeping at me.
> > >
> > > Resume works much better with intel_iommu=off. (I remember seeing
> > > a patch go by that purported to fix resume with IOMMU enabled, but
> > > it didn't work for me.)
> > >
> > > I'd be happy to try to make a better bug report if anyone has any
> > > bright ideas.
> >
> > i have a Lenovo T500 that does not even boot with DMAR enabled
> > in the BIOS (it's default-off), i get this panic in early bootup:
> >
> > DMAR hardware is malfunctioning
>
> That happens when the polling of the IOMMU registers fails to
> behave as expected in the VT-d specification. Typically it's been
> a BIOS issue when this happens.

it's hugely problematic to panic() the box early during bootup.
IOMMU code should be disabled instead, a warning emitted - and life
should continue.

With the current method you only ensure that distros turn DMAR
support off and that users disable it in their BIOS. That's a double
disadvantage.

Ingo

2009-03-25 17:53:42

by David Woodhouse

Subject: Re: 2.6.29: can't resume from suspend with DMAR (intel iommu) enabled

On Tue, 2009-03-24 at 21:32 +0100, Ingo Molnar wrote:
> (Cc:s added)
>
> * Andrew Lutomirski <[email protected]> wrote:
>
> > On vanilla 2.6.29 (on Ubuntu 8.10), on a Lenovo x200s, my system
> > is completely hosed on resume. It appears that even hard disk IO
> > didn't work (trying to do *anything* including getting a dmesg
> > trace just spewed sda io errors to the console). Hence no trace.
> > I did an alt-sysrq-b and the screen went blank and the machine
> > just started beeping at me.
> >
> > Resume works much better with intel_iommu=off. (I remember seeing
> > a patch go by that purported to fix resume with IOMMU enabled, but
> > it didn't work for me.)
> >
> > I'd be happy to try to make a better bug report if anyone has any
> > bright ideas.
>
> i have a Lenovo T500 that does not even boot with DMAR enabled
> in the BIOS (it's default-off), i get this panic in early bootup:
>
> DMAR hardware is malfunctioning

Can you show me the output of the pr_debug() statement at dmar.c:543 (in
alloc_iommu())?

And bring me the head of a BIOS author.

--
David Woodhouse Open Source Technology Centre
[email protected] Intel Corporation

2009-03-25 18:00:06

by Ingo Molnar

Subject: Re: 2.6.29: can't resume from suspend with DMAR (intel iommu) enabled


* David Woodhouse <[email protected]> wrote:

> On Tue, 2009-03-24 at 21:32 +0100, Ingo Molnar wrote:
> > (Cc:s added)
> >
> > * Andrew Lutomirski <[email protected]> wrote:
> >
> > > On vanilla 2.6.29 (on Ubuntu 8.10), on a Lenovo x200s, my system
> > > is completely hosed on resume. It appears that even hard disk IO
> > > didn't work (trying to do *anything* including getting a dmesg
> > > trace just spewed sda io errors to the console). Hence no trace.
> > > I did an alt-sysrq-b and the screen went blank and the machine
> > > just started beeping at me.
> > >
> > > Resume works much better with intel_iommu=off. (I remember seeing
> > > a patch go by that purported to fix resume with IOMMU enabled, but
> > > it didn't work for me.)
> > >
> > > I'd be happy to try to make a better bug report if anyone has any
> > > bright ideas.
> >
> > i have a Lenovo T500 that does not even boot with DMAR enabled
> > in the BIOS (it's default-off), i get this panic in early bootup:
> >
> > DMAR hardware is malfunctioning
>
> Can you show me the output of the pr_debug() statement at
> dmar.c:543 (in alloc_iommu())?

that's not easy - i use it right now :)

That's another reason why warnings and non-panic() behavior are
better for developers too. Had it not crashed i could have sent you
my dmesg and i would not have turned off DMAR in the BIOS.

Now it's turned off in my BIOS (first barrier) and i need to reboot
the kernel (second barrier) and i need to hack up a kernel in a
certain way to produce debug info (third barrier) - in the merge
window (fourth barrier ;-).

Ingo

2009-03-25 18:05:05

by David Woodhouse

Subject: Re: 2.6.29: can't resume from suspend with DMAR (intel iommu) enabled

On Wed, 2009-03-25 at 18:59 +0100, Ingo Molnar wrote:
>
> that's not easy - i use it right now :)
>
> That's another reason why warnings and non-panic() behavior are
> better for developers too. Had it not crashed i could have sent you
> my dmesg and i would not have turned off DMAR in the BIOS.
>
> Now it's turned off in my BIOS (first barrier) and i need to reboot
> the kernel (second barrier) and i need to hack up a kernel in a
> certain way to produce debug info (third barrier) - in the merge
> window (fourth barrier ;-).

Yeah, trusting BIOS monkeys for this was always going to be a bad plan.
We should have just known how to set/read the damn hardware BARs -- the
most likely explanation for this is that your BIOS is just lying to you
about where it put the registers, I believe.

I'd like to put in a basic sanity check when we first ioremap the
(alleged) DMAR registers. Hopefully, the output I asked for will confirm
that there's a simple way to do that...

--
David Woodhouse Open Source Technology Centre
[email protected] Intel Corporation
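
One possible shape for such a sanity check, as a userspace sketch. It
assumes that reads from an address with no device behind it come back as
all ones, which nothing in this thread confirms; dmar_regs_look_sane() and
its parameters are hypothetical names, and the two values stand for
whatever would be read from the freshly mapped region.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption for this sketch: a region with no device behind it reads back
 * as all ones. Returns true if the two values read from the mapped region
 * look plausible, false if both are all ones.
 */
static bool dmar_regs_look_sane(uint64_t cap, uint64_t ecap)
{
	return !(cap == UINT64_MAX && ecap == UINT64_MAX);
}
```

A caller could use a false result to skip IOMMU setup with a warning
rather than continuing on to a later failure.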

2009-03-25 18:08:19

by Ingo Molnar

Subject: Re: 2.6.29: can't resume from suspend with DMAR (intel iommu) enabled


* David Woodhouse <[email protected]> wrote:

> On Wed, 2009-03-25 at 18:59 +0100, Ingo Molnar wrote:
> >
> > that's not easy - i use it right now :)
> >
> > That's another reason why warnings and non-panic() behavior are
> > better for developers too. Had it not crashed i could have sent you
> > my dmesg and i would not have turned off DMAR in the BIOS.
> >
> > Now it's turned off in my BIOS (first barrier) and i need to reboot
> > the kernel (second barrier) and i need to hack up a kernel in a
> > certain way to produce debug info (third barrier) - in the merge
> > window (fourth barrier ;-).
>
> Yeah, trusting BIOS monkeys for this was always going to be a bad
> plan. We should have just known how to set/read the damn hardware
> BARs -- the most likely explanation for this is that your BIOS is
> just lying to you about where it put the registers, I believe.
>
> I'd like to put in a basic sanity check when we first ioremap the
> (alleged) DMAR registers. Hopefully, the output I asked for will
> confirm that there's a simple way to do that...

Could you please fix the panic() and add the debug output you'd like
to see? That would give me a kernel to run straight away. Without me
having to think much about what i should run and when.

(unless you really need this pr_debug info to proceed)

But it will be some time really. The laptop has 8 days uptime and is
not set up to run custom kernels at all.

Ingo

2009-03-25 18:10:37

by David Woodhouse

Subject: Re: 2.6.29: can't resume from suspend with DMAR (intel iommu) enabled

On Wed, 2009-03-25 at 19:07 +0100, Ingo Molnar wrote:
> * David Woodhouse <[email protected]> wrote:
>
> > On Wed, 2009-03-25 at 18:59 +0100, Ingo Molnar wrote:
> > >
> > > that's not easy - i use it right now :)
> > >
> > > That's another reason why warnings and non-panic() behavior are
> > > better for developers too. Had it not crashed i could have sent you
> > > my dmesg and i would not have turned off DMAR in the BIOS.
> > >
> > > Now it's turned off in my BIOS (first barrier) and i need to reboot
> > > the kernel (second barrier) and i need to hack up a kernel in a
> > > certain way to produce debug info (third barrier) - in the merge
> > > window (fourth barrier ;-).
> >
> > Yeah, trusting BIOS monkeys for this was always going to be a bad
> > plan. We should have just known how to set/read the damn hardware
> > BARs -- the most likely explanation for this is that your BIOS is
> > just lying to you about where it put the registers, I believe.
> >
> > I'd like to put in a basic sanity check when we first ioremap the
> > (alleged) DMAR registers. Hopefully, the output I asked for will
> > confirm that there's a simple way to do that...
>
> Could you please fix the panic() and add the debug output you'd like
> to see? That would give me a kernel to run straight away. Without me
> having to think much about what i should run and when.

That's distinctly non-trivial. I need to bail out early.

> (unless you really need this pr_debug info to proceed)
>
> But it will be some time really. The laptop has 8 days uptime and is
> not set up to run custom kernels at all.

Anyone else got a similar machine?

--
dwmw2

2009-03-25 18:20:44

by Ingo Molnar

Subject: Re: 2.6.29: can't resume from suspend with DMAR (intel iommu) enabled


* David Woodhouse <[email protected]> wrote:

> On Wed, 2009-03-25 at 19:07 +0100, Ingo Molnar wrote:
> > * David Woodhouse <[email protected]> wrote:
> >
> > > On Wed, 2009-03-25 at 18:59 +0100, Ingo Molnar wrote:
> > > >
> > > > that's not easy - i use it right now :)
> > > >
> > > > That's another reason why warnings and non-panic() behavior are
> > > > better for developers too. Had it not crashed i could have sent you
> > > > my dmesg and i would not have turned off DMAR in the BIOS.
> > > >
> > > > Now it's turned off in my BIOS (first barrier) and i need to reboot
> > > > the kernel (second barrier) and i need to hack up a kernel in a
> > > > certain way to produce debug info (third barrier) - in the merge
> > > > window (fourth barrier ;-).
> > >
> > > Yeah, trusting BIOS monkeys for this was always going to be a bad
> > > plan. We should have just known how to set/read the damn hardware
> > > BARs -- the most likely explanation for this is that your BIOS is
> > > just lying to you about where it put the registers, I believe.
> > >
> > > I'd like to put in a basic sanity check when we first ioremap the
> > > (alleged) DMAR registers. Hopefully, the output I asked for will
> > > confirm that there's a simple way to do that...
> >
> > Could you please fix the panic() and add the debug output you'd
> > like to see? That would give me a kernel to run straight away.
> > Without me having to think much about what i should run and
> > when.
>
> That's distinctly non-trivial. I need to bail out early.

yeah, DMA can start rather early, and we also need early resource
reservations, right? Or is there some other issue that makes it
difficult in addition to that?

Ingo

2009-03-25 18:43:49

by Ingo Molnar

Subject: Re: 2.6.29: can't resume from suspend with DMAR (intel iommu) enabled


btw., there's also the bugreport below.

Ingo

http://bugzilla.kernel.org/show_bug.cgi?id=12940

Summary: CONFIG_DMAR/CONFIG_DMAR_DEFAULT_ON makes the kernel
unbootable
Product: Platform Specific/Hardware
Version: 2.5
Kernel Version: 2.6.29
Platform: All
OS/Version: Linux
Tree: Mainline
Status: NEW
Severity: normal
Priority: P1
Component: x86-64
AssignedTo: [email protected]
ReportedBy: [email protected]
Regression: No


When i enable the options CONFIG_DMAR/CONFIG_DMAR_DEFAULT_ON my kernel fails to
boot. It just hangs at:

hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0
hpet0: 3 comparators, 64-bit 14.318180 MHz counter

right before that i see stuff saying that the IOMMU was remapping stuff. When i
encountered this i recompiled the kernel, disabled only
CONFIG_DMAR/CONFIG_DMAR_DEFAULT_ON and my kernel worked fine.


My hardware:
Tyan i5400PW/ 2x E5410 CPUs (latest BIOS)
8GB RAM
ATI radeonhd 3870
3ware 9650se


2009-03-25 18:48:48

by Fenghua Yu

Subject: [PATCH 2/2] Intel IOMMU Suspend/Resume Support for Queued Invalidation

This patch adds suspend/resume support for queued invalidation. During
suspend/resume, queued invalidation is disabled and then re-enabled. This
patch also consolidates the queued invalidation hardware setup into one
function.

Signed-off-by: Fenghua Yu <[email protected]>

---

dmar.c | 67 ++++++++++++++++++++++++++++++++++++++++++++++++++---------------
1 files changed, 52 insertions(+), 15 deletions(-)

diff --git a/drivers/pci/dmar.c b/drivers/pci/dmar.c
index d313039..b318bd1 100644
--- a/drivers/pci/dmar.c
+++ b/drivers/pci/dmar.c
@@ -790,14 +790,39 @@ end:
 }

 /*
+ * Enable queued invalidation.
+ */
+static void __dmar_enable_qi(struct intel_iommu *iommu)
+{
+	u32 cmd, sts;
+	unsigned long flags;
+	struct q_inval *qi = iommu->qi;
+
+	qi->free_head = qi->free_tail = 0;
+	qi->free_cnt = QI_LENGTH;
+
+	spin_lock_irqsave(&iommu->register_lock, flags);
+	/* write zero to the tail reg */
+	writel(0, iommu->reg + DMAR_IQT_REG);
+
+	dmar_writeq(iommu->reg + DMAR_IQA_REG, virt_to_phys(qi->desc));
+
+	cmd = iommu->gcmd | DMA_GCMD_QIE;
+	iommu->gcmd |= DMA_GCMD_QIE;
+	writel(cmd, iommu->reg + DMAR_GCMD_REG);
+
+	/* Make sure hardware complete it */
+	IOMMU_WAIT_OP(iommu, DMAR_GSTS_REG, readl, (sts & DMA_GSTS_QIES), sts);
+	spin_unlock_irqrestore(&iommu->register_lock, flags);
+}
+
+/*
  * Enable Queued Invalidation interface. This is a must to support
  * interrupt-remapping. Also used by DMA-remapping, which replaces
  * register based IOTLB invalidation.
  */
 int dmar_enable_qi(struct intel_iommu *iommu)
 {
-	u32 cmd, sts;
-	unsigned long flags;
 	struct q_inval *qi;

 	if (!ecap_qis(iommu->ecap))
@@ -835,19 +860,7 @@ int dmar_enable_qi(struct intel_iommu *iommu)

 	spin_lock_init(&qi->q_lock);

-	spin_lock_irqsave(&iommu->register_lock, flags);
-	/* write zero to the tail reg */
-	writel(0, iommu->reg + DMAR_IQT_REG);
-
-	dmar_writeq(iommu->reg + DMAR_IQA_REG, virt_to_phys(qi->desc));
-
-	cmd = iommu->gcmd | DMA_GCMD_QIE;
-	iommu->gcmd |= DMA_GCMD_QIE;
-	writel(cmd, iommu->reg + DMAR_GCMD_REG);
-
-	/* Make sure hardware complete it */
-	IOMMU_WAIT_OP(iommu, DMAR_GSTS_REG, readl, (sts & DMA_GSTS_QIES), sts);
-	spin_unlock_irqrestore(&iommu->register_lock, flags);
+	__dmar_enable_qi(iommu);

 	return 0;
 }
@@ -1102,3 +1115,27 @@ int __init enable_drhd_fault_handling(void)

 	return 0;
 }
+
+/*
+ * Re-enable Queued Invalidation interface.
+ */
+int dmar_reenable_qi(struct intel_iommu *iommu)
+{
+	if (!ecap_qis(iommu->ecap))
+		return -ENOENT;
+
+	if (!iommu->qi)
+		return -ENOENT;
+
+	/*
+	 * First disable queued invalidation.
+	 */
+	dmar_disable_qi(iommu);
+	/* Then enable queued invalidation again. Since there are no pending
+	 * invalidation requests now, it's safe to re-enable queued
+	 * invalidation.
+	 */
+	__dmar_enable_qi(iommu);
+
+	return 0;
+}
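
The disable-then-enable sequence in dmar_reenable_qi() can be modelled in
userspace. The following is a sketch, not kernel code: struct qi_model,
QI_SLOTS and the qi_model_*() functions are hypothetical names, and the
hardware register writes are replaced by a boolean flag. It only shows that
re-enabling resets the free head, tail and count before the queue is
switched back on.

```c
#include <stdbool.h>

#define QI_SLOTS 256 /* arbitrary queue size for this model */

/* Hypothetical model of the software view of an invalidation queue. */
struct qi_model {
	int free_head;
	int free_tail;
	int free_cnt;
	bool enabled;  /* stands in for the hardware enable bit */
};

static void qi_model_disable(struct qi_model *qi)
{
	qi->enabled = false;
}

/* Reset the bookkeeping, then switch the queue on. */
static void qi_model_enable(struct qi_model *qi)
{
	qi->free_head = 0;
	qi->free_tail = 0;
	qi->free_cnt = QI_SLOTS;
	qi->enabled = true;
}

/* Disable first so that the reset happens while the queue is off. */
static void qi_model_reenable(struct qi_model *qi)
{
	qi_model_disable(qi);
	qi_model_enable(qi);
}
```

Whatever state the queue was in before suspend, it comes back empty, which
matches the patch's comment that no invalidation requests are pending at
that point.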

2009-03-25 18:49:06

by Fenghua Yu

Subject: [PATCH 1/2] Intel IOMMU Suspend/Resume Support for DMAR


The Intel IOMMU driver does not currently support suspend and resume. On an
S3 event, the kernel crashes when the IOMMU is enabled. The attached patch
implements suspend and resume for the Intel IOMMU. It hooks into the kernel
suspend and resume interface. On suspend, it saves the necessary hardware
registers. On resume, it restores the registers and restarts the IOMMU by
setting up the root entry, enabling translation, and re-enabling queued
invalidation.

A separate patch for interrupt remapping suspend/resume will be sent
out in a day or two.

This patch set is applied on top of the tip tree.

Signed-off-by: Fenghua Yu <[email protected]>

---

drivers/pci/intel-iommu.c | 158 ++++++++++++++++++++++++++++++++++++++++++++
include/linux/intel-iommu.h | 4 +
2 files changed, 162 insertions(+)

diff --git a/drivers/pci/intel-iommu.c b/drivers/pci/intel-iommu.c
index 49402c3..a969bc8 100644
--- a/drivers/pci/intel-iommu.c
+++ b/drivers/pci/intel-iommu.c
@@ -36,6 +36,8 @@
 #include <linux/iova.h>
 #include <linux/iommu.h>
 #include <linux/intel-iommu.h>
+#include <linux/sysdev.h>
+#include <asm/i8259.h>
 #include <asm/cacheflush.h>
 #include <asm/iommu.h>
 #include "pci.h"
@@ -2563,6 +2565,161 @@ static void __init init_no_remapping_devices(void)
 	}
 }

+#ifdef CONFIG_PM
+static int init_iommu_hw(void)
+{
+	struct dmar_drhd_unit *drhd;
+	struct intel_iommu *iommu;
+
+	for_each_drhd_unit(drhd) {
+		if (drhd->ignored)
+			continue;
+
+		iommu = drhd->iommu;
+
+		if (iommu->qi)
+			dmar_reenable_qi(iommu);
+	}
+
+	for_each_drhd_unit(drhd) {
+		if (drhd->ignored)
+			continue;
+
+		iommu = drhd->iommu;
+
+		iommu_flush_write_buffer(iommu);
+
+		iommu_set_root_entry(iommu);
+
+		iommu->flush.flush_context(iommu, 0, 0, 0, DMA_CCMD_GLOBAL_INVL,
+					   0);
+		iommu->flush.flush_iotlb(iommu, 0, 0, 0, DMA_TLB_GLOBAL_FLUSH,
+					 0);
+		iommu_disable_protect_mem_regions(iommu);
+		iommu_enable_translation(iommu);
+	}
+
+	return 0;
+}
+
+static void iommu_flush_all(void)
+{
+	struct dmar_drhd_unit *drhd;
+	struct intel_iommu *iommu;
+
+	for_each_drhd_unit(drhd) {
+		if (drhd->ignored)
+			continue;
+
+		iommu = drhd->iommu;
+		iommu->flush.flush_context(iommu, 0, 0, 0, DMA_CCMD_GLOBAL_INVL,
+					   0);
+		iommu->flush.flush_iotlb(iommu, 0, 0, 0, DMA_TLB_GLOBAL_FLUSH,
+					 0);
+	}
+}
+
+static u32 iommu_state[MAX_IOMMUS][MAX_IOMMU_REGS];
+
+static int iommu_suspend(struct sys_device *dev, pm_message_t state)
+{
+	struct dmar_drhd_unit *drhd;
+	struct intel_iommu *iommu;
+	unsigned long flag;
+	int i = 0;
+
+	iommu_flush_all();
+
+	for_each_drhd_unit(drhd) {
+		if (drhd->ignored)
+			continue;
+
+		iommu = drhd->iommu;
+
+		iommu_disable_translation(iommu);
+
+		spin_lock_irqsave(&iommu->register_lock, flag);
+
+		iommu_state[i][DMAR_FECTL_REG] =
+			(u32) readl(iommu->reg + DMAR_FECTL_REG);
+		iommu_state[i][DMAR_FEDATA_REG] =
+			(u32) readl(iommu->reg + DMAR_FEDATA_REG);
+		iommu_state[i][DMAR_FEADDR_REG] =
+			(u32) readl(iommu->reg + DMAR_FEADDR_REG);
+		iommu_state[i][DMAR_FEUADDR_REG] =
+			(u32) readl(iommu->reg + DMAR_FEUADDR_REG);
+
+		spin_unlock_irqrestore(&iommu->register_lock, flag);
+		i++;
+	}
+	return 0;
+}
+
+static int iommu_resume(struct sys_device *dev)
+{
+	struct dmar_drhd_unit *drhd;
+	struct intel_iommu *iommu;
+	unsigned long flag;
+	int i = 0;
+
+	if (init_iommu_hw())
+		panic("IOMMU setup failed, DMAR can not start!\n");
+
+	for_each_drhd_unit(drhd) {
+		if (drhd->ignored)
+			continue;
+
+		iommu = drhd->iommu;
+
+		spin_lock_irqsave(&iommu->register_lock, flag);
+
+		writel((u32) iommu_state[i][DMAR_FECTL_REG],
+			iommu->reg + DMAR_FECTL_REG);
+		writel((u32) iommu_state[i][DMAR_FEDATA_REG],
+			iommu->reg + DMAR_FEDATA_REG);
+		writel((u32) iommu_state[i][DMAR_FEADDR_REG],
+			iommu->reg + DMAR_FEADDR_REG);
+		writel((u32) iommu_state[i][DMAR_FEUADDR_REG],
+			iommu->reg + DMAR_FEUADDR_REG);
+
+		spin_unlock_irqrestore(&iommu->register_lock, flag);
+		i++;
+	}
+	return 0;
+}
+
+static struct sysdev_class iommu_sysclass = {
+	.name = "iommu",
+	.resume = iommu_resume,
+	.suspend = iommu_suspend,
+};
+
+static struct sys_device device_iommu = {
+	.cls = &iommu_sysclass,
+};
+
+static int __init init_iommu_sysfs(void)
+{
+	int error;
+
+	error = sysdev_class_register(&iommu_sysclass);
+	if (error)
+		return error;
+
+	error = sysdev_register(&device_iommu);
+	if (error)
+		sysdev_class_unregister(&iommu_sysclass);
+
+	return error;
+}
+
+#else
+static int __init init_iommu_sysfs(void)
+{
+	return 0;
+}
+#endif /* CONFIG_PM */
+
 int __init intel_iommu_init(void)
 {
 	int ret = 0;
@@ -2598,6 +2755,7 @@ int __init intel_iommu_init(void)
 	init_timer(&unmap_timer);
 	force_iommu = 1;
 	dma_ops = &intel_dma_ops;
+	init_iommu_sysfs();

 	register_iommu(&intel_iommu_ops);

diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 1d6c71d..5ec836b 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -205,6 +205,9 @@ static inline void dmar_writeq(void __iomem *addr, u64 val)
 /* low 64 bit */
 #define dma_frcd_page_addr(d) (d & (((u64)-1) << PAGE_SHIFT))

+#define MAX_IOMMUS 32
+#define MAX_IOMMU_REGS 0xc0
+
 #define IOMMU_WAIT_OP(iommu, offset, op, cond, sts) \
 do { \
 	cycles_t start_time = get_cycles(); \
@@ -322,6 +325,7 @@ extern int alloc_iommu(struct dmar_drhd_unit *drhd);
 extern void free_iommu(struct intel_iommu *iommu);
 extern int dmar_enable_qi(struct intel_iommu *iommu);
 extern void dmar_disable_qi(struct intel_iommu *iommu);
+extern int dmar_reenable_qi(struct intel_iommu *iommu);
 extern void qi_global_iec(struct intel_iommu *iommu);

 extern int qi_flush_context(struct intel_iommu *iommu, u16 did, u16 sid,
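
The save-and-restore pattern used by iommu_suspend() and iommu_resume() can
be tried out in userspace. This is a sketch under stated assumptions, not
the patch itself: struct unit_model, MODEL_UNITS and the model_*()
functions are hypothetical names, registers are plain array slots instead
of MMIO, and locking is omitted. Where the patch indexes its save area by
register offset, the sketch swaps in a compact enum with one slot per
saved register.

```c
#include <stdint.h>

#define MODEL_UNITS 2 /* arbitrary number of units for this model */

/* The four fault-event registers that the patch saves, as compact indices. */
enum { FECTL, FEDATA, FEADDR, FEUADDR, NUM_SAVED };

/* Hypothetical stand-in for one unit's register file. */
struct unit_model {
	uint32_t regs[NUM_SAVED];
};

/* Save area: one row per unit, one column per saved register. */
static uint32_t saved[MODEL_UNITS][NUM_SAVED];

/* Copy every saved register of every unit into the save area. */
static void model_suspend(const struct unit_model *units, int n)
{
	for (int i = 0; i < n; i++)
		for (int r = 0; r < NUM_SAVED; r++)
			saved[i][r] = units[i].regs[r];
}

/* Simulate the registers losing their contents while powered down. */
static void model_power_loss(struct unit_model *units, int n)
{
	for (int i = 0; i < n; i++)
		for (int r = 0; r < NUM_SAVED; r++)
			units[i].regs[r] = 0;
}

/* Write the saved values back in the same unit order used on suspend. */
static void model_resume(struct unit_model *units, int n)
{
	for (int i = 0; i < n; i++)
		for (int r = 0; r < NUM_SAVED; r++)
			units[i].regs[r] = saved[i][r];
}
```

Both loops must visit the units in the same order, since the row index is
the only link between a unit and its saved values; the patch relies on the
same property by counting with i while skipping ignored units in both
functions.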

2009-04-06 20:56:48

by David Woodhouse

Subject: Re: 2.6.29: can't resume from suspend with DMAR (intel iommu) enabled

On Wed, 2009-03-25 at 18:59 +0100, Ingo Molnar wrote:
> > > i have a Lenovo T500 that does not even boot with DMAR enabled
> > > in the BIOS (it's default-off), i get this panic in early bootup:
> > >
> > > DMAR hardware is malfunctioning
> >
> > Can you show me the output of the pr_debug() statement at
> > dmar.c:543 (in alloc_iommu())?
>
> that's not easy - i use it right now :)
>
> That's another reason why warnings and non-panic() behavior are
> better for developers too. Had it not crashed i could have sent you
> my dmesg and i would not have turned off DMAR in the BIOS.
>
> Now it's turned off in my BIOS (first barrier) and i need to reboot
> the kernel (second barrier) and i need to hack up a kernel in a
> certain way to produce debug info (third barrier) - in the merge
> window (fourth barrier ;-).

If you manage to get a chance to do this after the merge window closes,
that would be much appreciated.

I have a T400 now, which will help with certain things (at least I have
a machine on which I can test suspend/resume). But I don't think I can
reproduce your panic on it.

Dirk reported the same panic, but we've just turned his iommu back on
and completely failed to reproduce the problem -- so I'm kind of stuck.

--
David Woodhouse Open Source Technology Centre
[email protected] Intel Corporation

2009-04-07 07:57:14

by Matthew Garrett

Subject: Re: 2.6.29: can't resume from suspend with DMAR (intel iommu) enabled

On Mon, Apr 06, 2009 at 01:56:01PM -0700, David Woodhouse wrote:

> Dirk reported the same panic, but we've just turned his iommu back on
> and completely failed to reproduce the problem -- so I'm kind of stuck.

I saw this on a T400, but only on the first boot after enabling DMAR -
power cycling "fixed" it.

--
Matthew Garrett | [email protected]

2009-04-07 11:11:26

by David Woodhouse

Subject: Re: 2.6.29: can't resume from suspend with DMAR (intel iommu) enabled

On Tue, 2009-04-07 at 08:56 +0100, Matthew Garrett wrote:
> On Mon, Apr 06, 2009 at 01:56:01PM -0700, David Woodhouse wrote:
>
> > Dirk reported the same panic, but we've just turned his iommu back on
> > and completely failed to reproduce the problem -- so I'm kind of stuck.
>
> I saw this on a T400, but only on the first boot after enabling DMAR -
> power cycling "fixed" it.

Aha, that's interesting to know. Thanks.

--
David Woodhouse Open Source Technology Centre
[email protected] Intel Corporation

2009-04-10 21:28:23

by David Woodhouse

Subject: Re: 2.6.29: can't resume from suspend with DMAR (intel iommu) enabled

On Tue, 2009-04-07 at 08:56 +0100, Matthew Garrett wrote:
> On Mon, Apr 06, 2009 at 01:56:01PM -0700, David Woodhouse wrote:
>
> > Dirk reported the same panic, but we've just turned his iommu back on
> > and completely failed to reproduce the problem -- so I'm kind of stuck.
>
> I saw this on a T400, but only on the first boot after enabling DMAR -
> power cycling "fixed" it.

Ah, I can reproduce now -- thanks for spotting that. I'll come up with
something that can spot this failure mode early and bail out. With a
nasty message about BIOS engineers.

--
dwmw2

2009-04-10 22:46:56

by David Woodhouse

Subject: Re: 2.6.29: can't resume from suspend with DMAR (intel iommu) enabled

On Tue, 2009-03-24 at 16:40 -0400, Kyle McMartin wrote:
> > Current kernel doesn't have iommu suspend/resume support yet. I'll
> > send out suspend/resume patches today or tomorrow. Hope that will
> > help.
>
> Heh, awesome, someone could have brought this, uhm, subtle, weakness
> when things were getting defaulted on...

git://git.infradead.org/~dwmw2/iommu-suspend-2.6.29.git

That's just for IOMMU, not interrupt remapping (yet).

--
David Woodhouse Open Source Technology Centre
[email protected] Intel Corporation

2009-04-10 23:21:21

by Fenghua Yu

Subject: RE: 2.6.29: can't resume from suspend with DMAR (intel iommu) enabled

>-----Original Message-----
>From: David Woodhouse [mailto:[email protected]]
>Sent: Friday, April 10, 2009 3:46 PM
>To: Kyle McMartin
>Cc: Yu, Fenghua; 'Ingo Molnar'; 'Andrew Lutomirski'; 'Jesse Barnes';
>Siddha, Suresh B; 'Yinghai Lu'; 'Mark Gross'; 'LKML'
>Subject: Re: 2.6.29: can't resume from suspend with DMAR (intel iommu)
>enabled
>
>On Tue, 2009-03-24 at 16:40 -0400, Kyle McMartin wrote:
>> > Current kernel doesn't have iommu suspend/resume support yet. I'll
>> > send out suspend/resume patches today or tomorrow. Hope that will
>> > help.
>>
>> Heh, awesome, someone could have brought this, uhm, subtle, weakness
>> when things were getting defaulted on...
>
>git://git.infradead.org/~dwmw2/iommu-suspend-2.6.29.git
>
>That's just for IOMMU, not interrupt remapping (yet).
>

Interrupt remapping needs to be backported to 2.6.29 from the kernel git tree.

Thanks.

-Fenghua

2009-04-11 00:49:51

by Suresh Siddha

Subject: Re: 2.6.29: can't resume from suspend with DMAR (intel iommu) enabled

On Fri, 2009-04-10 at 15:46 -0700, David Woodhouse wrote:
> On Tue, 2009-03-24 at 16:40 -0400, Kyle McMartin wrote:
> > > Current kernel doesn't have iommu suspend/resume support yet. I'll
> > > send out suspend/resume patches today or tomorrow. Hope that will
> > > help.
> >
> > Heh, awesome, someone could have brought this, uhm, subtle, weakness
> > when things were getting defaulted on...
>
> git://git.infradead.org/~dwmw2/iommu-suspend-2.6.29.git
>
> That's just for IOMMU, not interrupt remapping (yet).

Is this for 2.6.29-stable series? Is this really critical to have it for
-stable?

2009-04-11 02:13:36

by David Woodhouse

Subject: Re: 2.6.29: can't resume from suspend with DMAR (intel iommu) enabled

On Fri, 2009-04-10 at 17:48 -0700, Suresh Siddha wrote:
> On Fri, 2009-04-10 at 15:46 -0700, David Woodhouse wrote:
> > On Tue, 2009-03-24 at 16:40 -0400, Kyle McMartin wrote:
> > > > Current kernel doesn't have iommu suspend/resume support yet. I'll
> > > > send out suspend/resume patches today or tomorrow. Hope that will
> > > > help.
> > >
> > > Heh, awesome, someone could have brought this, uhm, subtle, weakness
> > > when things were getting defaulted on...
> >
> > git://git.infradead.org/~dwmw2/iommu-suspend-2.6.29.git
> >
> > That's just for IOMMU, not interrupt remapping (yet).
>
> Is this for 2.6.29-stable series? Is this really critical to have it for
> -stable?

I suspect not. We don't need it on Cantiga, and I think that's all we
really care about for 2.6.29-stable. Non-laptop chipsets, or laptop
chipsets that aren't actually out there in the wild yet, can wait for
2.6.30.

The less we have to backport, the better -- and suspend/resume support
for interrupt remapping would also require backporting a bunch of other
changes in the apic code.

At least what I have in the above tree so far is self-contained in the
VT-d code¹.

--
David Woodhouse Open Source Technology Centre
[email protected] Intel Corporation

¹ Conveniently forgetting the part where we change the definition of the
if() macro...

2009-04-11 06:04:49

by David Woodhouse

Subject: Re: 2.6.29: can't resume from suspend with DMAR (intel iommu) enabled

On Tue, 2009-04-07 at 08:56 +0100, Matthew Garrett wrote:
> On Mon, Apr 06, 2009 at 01:56:01PM -0700, David Woodhouse wrote:
>
> > Dirk reported the same panic, but we've just turned his iommu back on
> > and completely failed to reproduce the problem -- so I'm kind of stuck.
>
> I saw this on a T400, but only on the first boot after enabling DMAR -
> power cycling "fixed" it.

OK, that should be fixed (amongst other things) in
git.infradead.org/users/dwmw2/iommu-suspend-2.6.29.git and in the Fedora
2.6.29.1-68.fc11 kernel. Thanks.

http://www.kerneloops.org/raw.php?rawid=346794

--
David Woodhouse Open Source Technology Centre
[email protected] Intel Corporation

2009-04-11 14:39:58

by Kyle McMartin

Subject: Re: 2.6.29: can't resume from suspend with DMAR (intel iommu) enabled

On Fri, Apr 10, 2009 at 11:04:12PM -0700, David Woodhouse wrote:
> On Tue, 2009-04-07 at 08:56 +0100, Matthew Garrett wrote:
> > On Mon, Apr 06, 2009 at 01:56:01PM -0700, David Woodhouse wrote:
> >
> > > Dirk reported the same panic, but we've just turned his iommu back on
> > > and completely failed to reproduce the problem -- so I'm kind of stuck.
> >
> > I saw this on a T400, but only on the first boot after enabling DMAR -
> > power cycling "fixed" it.
>
> OK, that should be fixed (amongst other things) in
> git.infradead.org/users/dwmw2/iommu-suspend-2.6.29.git and in the Fedora
> 2.6.29.1-68.fc11 kernel. Thanks.
>
> http://www.kerneloops.org/raw.php?rawid=346794
>

Does this include backporting the interrupt remapping fixes, or just
ignoring it since none of the current laptop chipsets support it?

regards, Kyle

2009-04-11 16:53:36

by David Woodhouse

Subject: Re: 2.6.29: can't resume from suspend with DMAR (intel iommu) enabled

On Sat, 2009-04-11 at 10:38 -0400, Kyle McMartin wrote:
> On Fri, Apr 10, 2009 at 11:04:12PM -0700, David Woodhouse wrote:
> > On Tue, 2009-04-07 at 08:56 +0100, Matthew Garrett wrote:
> > > On Mon, Apr 06, 2009 at 01:56:01PM -0700, David Woodhouse wrote:
> > >
> > > > Dirk reported the same panic, but we've just turned his iommu back on
> > > > and completely failed to reproduce the problem -- so I'm kind of stuck.
> > >
> > > I saw this on a T400, but only on the first boot after enabling DMAR -
> > > power cycling "fixed" it.
> >
> > OK, that should be fixed (amongst other things) in
> > git.infradead.org/users/dwmw2/iommu-suspend-2.6.29.git and in the Fedora
> > 2.6.29.1-68.fc11 kernel. Thanks.
> >
> > http://www.kerneloops.org/raw.php?rawid=346794
> >
>
> Does this include backporting the interrupt remapping fixes, or just
> ignoring it since none of the current laptop chipsets support it?

The latter. What I've done so far is quite large for a -stable
submission already. The interrupt remapping stuff would be even more
intrusive, and I'm entirely unconvinced that we actually _need_ it for
current systems.

I could make it refuse to suspend if intr_remap is enabled, I suppose --
that would be simple enough.

--
David Woodhouse Open Source Technology Centre
[email protected] Intel Corporation

2009-04-11 17:15:54

by Kyle McMartin

Subject: Re: 2.6.29: can't resume from suspend with DMAR (intel iommu) enabled

On Sat, Apr 11, 2009 at 09:52:39AM -0700, David Woodhouse wrote:
> > Does this include backporting the interrupt remapping fixes, or just
> > ignoring it since none of the current laptop chipsets support it?
>
> The latter. What I've done so far is quite large for a -stable
> submission already. The interrupt remapping stuff would be even more
> intrusive, and I'm entirely unconvinced that we actually _need_ it for
> current systems.
>
> I could make it refuse to suspend if intr_remap is enabled, I suppose --
> that would be simple enough.
>

Right, that's what I was afraid of. I think refusing to suspend is probably
a good idea. I'm probably not the only person who suspends his desktop
machines when they're not in use...

regards, Kyle