2009-12-14 03:42:34

by FUJITA Tomonori

[permalink] [raw]
Subject: [PATCH 0/2] x86: gart: fix the HW IOMMU initialization rewrite breakage

This patchset fixes GART issues due to the commit
75f1cdf1dda92cae037ec848ae63690d91913eac (x86: Handle HW IOMMU
initialization failure gracefully).

This is a revised version of:

http://marc.info/?l=linux-kernel&m=126027050409121&w=2

I replaced the first patch with a simpler one (*1). With the new
patch, systems that use swiotlb temporarily allocate more memory than
they do now. But swiotlb uses any area in ZONE_DMA32 so I doubt that
it matters. If swiotlb can't allocate memory, the system panics. So
there is no danger of data corruption (or something really bad). We
can use more complicated tricks (such as I posted before) when we find
that this simple approach doesn't work.

(*1) http://marc.info/?l=linux-kernel&m=125905247710259&w=2


2009-12-14 03:42:54

by FUJITA Tomonori

[permalink] [raw]
Subject: [PATCH 1/2] x86: move swiotlb initialization before dma32_free_bootmem

The commit 75f1cdf1dda92cae037ec848ae63690d91913eac introduced a bug
that we initialize SWIOTLB right after dma32_free_bootmem so we
wrongly steal memory area allocated for GART with broken BIOS earlier.

This moves swiotlb initialization before dma32_free_bootmem().

Signed-off-by: FUJITA Tomonori <[email protected]>
---
arch/x86/kernel/pci-dma.c | 5 ++++-
1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
index afcc58b..fcc2f2b 100644
--- a/arch/x86/kernel/pci-dma.c
+++ b/arch/x86/kernel/pci-dma.c
@@ -120,11 +120,14 @@ static void __init dma32_free_bootmem(void)

void __init pci_iommu_alloc(void)
{
+ int use_swiotlb;
+
+ use_swiotlb = pci_swiotlb_init();
#ifdef CONFIG_X86_64
/* free the range so iommu could get some range less than 4G */
dma32_free_bootmem();
#endif
- if (pci_swiotlb_init())
+ if (use_swiotlb)
return;

gart_iommu_hole_init();
--
1.5.6.5

2009-12-14 03:42:47

by FUJITA Tomonori

[permalink] [raw]
Subject: [PATCH 2/2] x86: gart: fix breakage due to IOMMU initialization cleanup

From: Yinghai Lu <[email protected]>

This fixes the following breakage of the commit
75f1cdf1dda92cae037ec848ae63690d91913eac:

- GART systems that don't AGP with broken BIOS and more than 4GB
memory are forced to use swiotlb. They can allocate aperture by hand
and use GART.

- GART systems without GAP must disable GART on shutdown.

- swiotlb usage is forced by the boot option, gart_iommu_hole_init()
is not called. so we disable GART early_gart_iommu_check().

Signed-off-by: Yinghai Lu <[email protected]>
Signed-off-by: FUJITA Tomonori <[email protected]>
---
arch/x86/kernel/aperture_64.c | 11 ++++++-----
arch/x86/kernel/pci-gart_64.c | 3 ++-
2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/aperture_64.c b/arch/x86/kernel/aperture_64.c
index e0dfb68..3704997 100644
--- a/arch/x86/kernel/aperture_64.c
+++ b/arch/x86/kernel/aperture_64.c
@@ -280,7 +280,8 @@ void __init early_gart_iommu_check(void)
* or BIOS forget to put that in reserved.
* try to update e820 to make that region as reserved.
*/
- int i, fix, slot;
+ u32 agp_aper_base = 0, agp_aper_order = 0;
+ int i, fix, slot, valid_agp = 0;
u32 ctl;
u32 aper_size = 0, aper_order = 0, last_aper_order = 0;
u64 aper_base = 0, last_aper_base = 0;
@@ -290,6 +291,8 @@ void __init early_gart_iommu_check(void)
return;

/* This is mostly duplicate of iommu_hole_init */
+ agp_aper_base = search_agp_bridge(&agp_aper_order, &valid_agp);
+
fix = 0;
for (i = 0; i < ARRAY_SIZE(bus_dev_ranges); i++) {
int bus;
@@ -342,10 +345,10 @@ void __init early_gart_iommu_check(void)
}
}

- if (!fix)
+ if (valid_agp)
return;

- /* different nodes have different setting, disable them all at first*/
+ /* disable them all at first */
for (i = 0; i < ARRAY_SIZE(bus_dev_ranges); i++) {
int bus;
int dev_base, dev_limit;
@@ -458,8 +461,6 @@ out:

if (aper_alloc) {
/* Got the aperture from the AGP bridge */
- } else if (!valid_agp) {
- /* Do nothing */
} else if ((!no_iommu && max_pfn > MAX_DMA32_PFN) ||
force_iommu ||
valid_agp ||
diff --git a/arch/x86/kernel/pci-gart_64.c b/arch/x86/kernel/pci-gart_64.c
index e6a0d40..56c0e73 100644
--- a/arch/x86/kernel/pci-gart_64.c
+++ b/arch/x86/kernel/pci-gart_64.c
@@ -710,7 +710,8 @@ static void gart_iommu_shutdown(void)
struct pci_dev *dev;
int i;

- if (no_agp)
+ /* don't shutdown it if there is AGP installed */
+ if (!no_agp)
return;

for (i = 0; i < num_k8_northbridges; i++) {
--
1.5.6.5

2009-12-14 09:49:54

by FUJITA Tomonori

[permalink] [raw]
Subject: [tip:x86/urgent] x86: Move swiotlb initialization before dma32_free_bootmem

Commit-ID: f4780ca005404166cc40af77ef0e86132ab98a81
Gitweb: http://git.kernel.org/tip/f4780ca005404166cc40af77ef0e86132ab98a81
Author: FUJITA Tomonori <[email protected]>
AuthorDate: Mon, 14 Dec 2009 11:52:14 +0900
Committer: Ingo Molnar <[email protected]>
CommitDate: Mon, 14 Dec 2009 08:57:40 +0100

x86: Move swiotlb initialization before dma32_free_bootmem

The commit 75f1cdf1dda92cae037ec848ae63690d91913eac introduced a
bug that we initialize SWIOTLB right after dma32_free_bootmem so
we wrongly steal memory area allocated for GART with broken BIOS
earlier.

This moves swiotlb initialization before dma32_free_bootmem().

Signed-off-by: FUJITA Tomonori <[email protected]>
Cc: [email protected]
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/kernel/pci-dma.c | 5 ++++-
1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
index afcc58b..fcc2f2b 100644
--- a/arch/x86/kernel/pci-dma.c
+++ b/arch/x86/kernel/pci-dma.c
@@ -120,11 +120,14 @@ static void __init dma32_free_bootmem(void)

void __init pci_iommu_alloc(void)
{
+ int use_swiotlb;
+
+ use_swiotlb = pci_swiotlb_init();
#ifdef CONFIG_X86_64
/* free the range so iommu could get some range less than 4G */
dma32_free_bootmem();
#endif
- if (pci_swiotlb_init())
+ if (use_swiotlb)
return;

gart_iommu_hole_init();

2009-12-14 09:49:41

by Yinghai Lu

[permalink] [raw]
Subject: [tip:x86/urgent] x86: Gart: fix breakage due to IOMMU initialization cleanup

Commit-ID: f3eee54276dfd1117fd94259f2b4a38388264724
Gitweb: http://git.kernel.org/tip/f3eee54276dfd1117fd94259f2b4a38388264724
Author: Yinghai Lu <[email protected]>
AuthorDate: Mon, 14 Dec 2009 11:52:15 +0900
Committer: Ingo Molnar <[email protected]>
CommitDate: Mon, 14 Dec 2009 08:57:40 +0100

x86: Gart: fix breakage due to IOMMU initialization cleanup

This fixes the following breakage of the commit
75f1cdf1dda92cae037ec848ae63690d91913eac:

- GART systems that don't AGP with broken BIOS and more than 4GB
memory are forced to use swiotlb. They can allocate aperture by
hand and use GART.

- GART systems without GAP must disable GART on shutdown.

- swiotlb usage is forced by the boot option,
gart_iommu_hole_init() is not called, so we disable GART
early_gart_iommu_check().

Signed-off-by: Yinghai Lu <[email protected]>
Signed-off-by: FUJITA Tomonori <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/kernel/aperture_64.c | 11 ++++++-----
arch/x86/kernel/pci-gart_64.c | 3 ++-
2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/aperture_64.c b/arch/x86/kernel/aperture_64.c
index e0dfb68..3704997 100644
--- a/arch/x86/kernel/aperture_64.c
+++ b/arch/x86/kernel/aperture_64.c
@@ -280,7 +280,8 @@ void __init early_gart_iommu_check(void)
* or BIOS forget to put that in reserved.
* try to update e820 to make that region as reserved.
*/
- int i, fix, slot;
+ u32 agp_aper_base = 0, agp_aper_order = 0;
+ int i, fix, slot, valid_agp = 0;
u32 ctl;
u32 aper_size = 0, aper_order = 0, last_aper_order = 0;
u64 aper_base = 0, last_aper_base = 0;
@@ -290,6 +291,8 @@ void __init early_gart_iommu_check(void)
return;

/* This is mostly duplicate of iommu_hole_init */
+ agp_aper_base = search_agp_bridge(&agp_aper_order, &valid_agp);
+
fix = 0;
for (i = 0; i < ARRAY_SIZE(bus_dev_ranges); i++) {
int bus;
@@ -342,10 +345,10 @@ void __init early_gart_iommu_check(void)
}
}

- if (!fix)
+ if (valid_agp)
return;

- /* different nodes have different setting, disable them all at first*/
+ /* disable them all at first */
for (i = 0; i < ARRAY_SIZE(bus_dev_ranges); i++) {
int bus;
int dev_base, dev_limit;
@@ -458,8 +461,6 @@ out:

if (aper_alloc) {
/* Got the aperture from the AGP bridge */
- } else if (!valid_agp) {
- /* Do nothing */
} else if ((!no_iommu && max_pfn > MAX_DMA32_PFN) ||
force_iommu ||
valid_agp ||
diff --git a/arch/x86/kernel/pci-gart_64.c b/arch/x86/kernel/pci-gart_64.c
index e6a0d40..56c0e73 100644
--- a/arch/x86/kernel/pci-gart_64.c
+++ b/arch/x86/kernel/pci-gart_64.c
@@ -710,7 +710,8 @@ static void gart_iommu_shutdown(void)
struct pci_dev *dev;
int i;

- if (no_agp)
+ /* don't shutdown it if there is AGP installed */
+ if (!no_agp)
return;

for (i = 0; i < num_k8_northbridges; i++) {

2009-12-14 11:02:21

by Yinghai Lu

[permalink] [raw]
Subject: Re: [tip:x86/urgent] x86: Move swiotlb initialization before dma32_free_bootmem

tip-bot for FUJITA Tomonori wrote:
> Commit-ID: f4780ca005404166cc40af77ef0e86132ab98a81
> Gitweb: http://git.kernel.org/tip/f4780ca005404166cc40af77ef0e86132ab98a81
> Author: FUJITA Tomonori <[email protected]>
> AuthorDate: Mon, 14 Dec 2009 11:52:14 +0900
> Committer: Ingo Molnar <[email protected]>
> CommitDate: Mon, 14 Dec 2009 08:57:40 +0100
>
> x86: Move swiotlb initialization before dma32_free_bootmem
>
> The commit 75f1cdf1dda92cae037ec848ae63690d91913eac introduced a
> bug that we initialize SWIOTLB right after dma32_free_bootmem so
> we wrongly steal memory area allocated for GART with broken BIOS
> earlier.
>
> This moves swiotlb initialization before dma32_free_bootmem().
>
> Signed-off-by: FUJITA Tomonori <[email protected]>
> Cc: [email protected]
> LKML-Reference: <[email protected]>
> Signed-off-by: Ingo Molnar <[email protected]>
> ---
> arch/x86/kernel/pci-dma.c | 5 ++++-
> 1 files changed, 4 insertions(+), 1 deletions(-)
>
> diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
> index afcc58b..fcc2f2b 100644
> --- a/arch/x86/kernel/pci-dma.c
> +++ b/arch/x86/kernel/pci-dma.c
> @@ -120,11 +120,14 @@ static void __init dma32_free_bootmem(void)
>
> void __init pci_iommu_alloc(void)
> {
> + int use_swiotlb;
> +
> + use_swiotlb = pci_swiotlb_init();
> #ifdef CONFIG_X86_64
> /* free the range so iommu could get some range less than 4G */
> dma32_free_bootmem();
> #endif
> - if (pci_swiotlb_init())
> + if (use_swiotlb)
> return;
>
> gart_iommu_hole_init();
> --

on one 1t socket system numa is enabled, got

[ 0.000000] bootmem alloc of 67108864 bytes failed!
[ 0.000000] Kernel panic - not syncing: Out of memory
[ 0.000000] Pid: 0, comm: swapper Not tainted 2.6.32-tip-yh-08635-g3614d3a-dirty #902
[ 0.000000] Call Trace:
[ 0.000000] [<ffffffff81cb3fec>] panic+0x7a/0x136
[ 0.000000] [<ffffffff81cb7819>] ? _raw_spin_unlock_irqrestore+0x3d/0x4c
[ 0.000000] [<ffffffff810a5866>] ? trace_hardirqs_off+0xd/0xf
[ 0.000000] [<ffffffff8273d055>] ___alloc_bootmem_node+0x0/0x61
[ 0.000000] [<ffffffff8273d1c1>] __alloc_bootmem_low+0xe/0x10
[ 0.000000] [<ffffffff82749b4b>] swiotlb_init_with_default_size+0x41/0x127
[ 0.000000] [<ffffffff82749c41>] swiotlb_init+0x10/0x12
[ 0.000000] [<ffffffff827326ef>] pci_swiotlb_init+0x52/0x67
[ 0.000000] [<ffffffff81cbfbfc>] ? _etext+0x0/0x24
[ 0.000000] [<ffffffff827276dd>] pci_iommu_alloc+0xc/0x6e
[ 0.000000] [<ffffffff81cbfbfc>] ? _etext+0x0/0x24
[ 0.000000] [<ffffffff82735037>] mem_init+0x19/0xec
[ 0.000000] [<ffffffff81caa52e>] ? x86_init_noop+0x9/0xb
[ 0.000000] [<ffffffff82720b1a>] start_kernel+0x13c/0x313
[ 0.000000] [<ffffffff82720295>] x86_64_start_reservations+0xa5/0xa9
[ 0.000000] [<ffffffff8272037a>] x86_64_start_kernel+0xe1/0xe8
[ 0.000000] ------------[ cut here ]------------

2009-12-14 11:35:14

by FUJITA Tomonori

[permalink] [raw]
Subject: Re: [tip:x86/urgent] x86: Move swiotlb initialization before dma32_free_bootmem

On Mon, 14 Dec 2009 03:00:19 -0800
Yinghai Lu <[email protected]> wrote:

> tip-bot for FUJITA Tomonori wrote:
> > Commit-ID: f4780ca005404166cc40af77ef0e86132ab98a81
> > Gitweb: http://git.kernel.org/tip/f4780ca005404166cc40af77ef0e86132ab98a81
> > Author: FUJITA Tomonori <[email protected]>
> > AuthorDate: Mon, 14 Dec 2009 11:52:14 +0900
> > Committer: Ingo Molnar <[email protected]>
> > CommitDate: Mon, 14 Dec 2009 08:57:40 +0100
> >
> > x86: Move swiotlb initialization before dma32_free_bootmem
> >
> > The commit 75f1cdf1dda92cae037ec848ae63690d91913eac introduced a
> > bug that we initialize SWIOTLB right after dma32_free_bootmem so
> > we wrongly steal memory area allocated for GART with broken BIOS
> > earlier.
> >
> > This moves swiotlb initialization before dma32_free_bootmem().
> >
> > Signed-off-by: FUJITA Tomonori <[email protected]>
> > Cc: [email protected]
> > LKML-Reference: <[email protected]>
> > Signed-off-by: Ingo Molnar <[email protected]>
> > ---
> > arch/x86/kernel/pci-dma.c | 5 ++++-
> > 1 files changed, 4 insertions(+), 1 deletions(-)
> >
> > diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
> > index afcc58b..fcc2f2b 100644
> > --- a/arch/x86/kernel/pci-dma.c
> > +++ b/arch/x86/kernel/pci-dma.c
> > @@ -120,11 +120,14 @@ static void __init dma32_free_bootmem(void)
> >
> > void __init pci_iommu_alloc(void)
> > {
> > + int use_swiotlb;
> > +
> > + use_swiotlb = pci_swiotlb_init();
> > #ifdef CONFIG_X86_64
> > /* free the range so iommu could get some range less than 4G */
> > dma32_free_bootmem();
> > #endif
> > - if (pci_swiotlb_init())
> > + if (use_swiotlb)
> > return;
> >
> > gart_iommu_hole_init();
> > --
>
> on one 1t socket system numa is enabled, got

Doh, sorry, seems that we have to go with the previous approach...

How much memory does your box? VMEMMAP enabled?

Can you please send the full boot log and your config?


Thanks,

> [ 0.000000] bootmem alloc of 67108864 bytes failed!
> [ 0.000000] Kernel panic - not syncing: Out of memory
> [ 0.000000] Pid: 0, comm: swapper Not tainted 2.6.32-tip-yh-08635-g3614d3a-dirty #902
> [ 0.000000] Call Trace:
> [ 0.000000] [<ffffffff81cb3fec>] panic+0x7a/0x136
> [ 0.000000] [<ffffffff81cb7819>] ? _raw_spin_unlock_irqrestore+0x3d/0x4c
> [ 0.000000] [<ffffffff810a5866>] ? trace_hardirqs_off+0xd/0xf
> [ 0.000000] [<ffffffff8273d055>] ___alloc_bootmem_node+0x0/0x61
> [ 0.000000] [<ffffffff8273d1c1>] __alloc_bootmem_low+0xe/0x10
> [ 0.000000] [<ffffffff82749b4b>] swiotlb_init_with_default_size+0x41/0x127
> [ 0.000000] [<ffffffff82749c41>] swiotlb_init+0x10/0x12
> [ 0.000000] [<ffffffff827326ef>] pci_swiotlb_init+0x52/0x67
> [ 0.000000] [<ffffffff81cbfbfc>] ? _etext+0x0/0x24
> [ 0.000000] [<ffffffff827276dd>] pci_iommu_alloc+0xc/0x6e
> [ 0.000000] [<ffffffff81cbfbfc>] ? _etext+0x0/0x24
> [ 0.000000] [<ffffffff82735037>] mem_init+0x19/0xec
> [ 0.000000] [<ffffffff81caa52e>] ? x86_init_noop+0x9/0xb
> [ 0.000000] [<ffffffff82720b1a>] start_kernel+0x13c/0x313
> [ 0.000000] [<ffffffff82720295>] x86_64_start_reservations+0xa5/0xa9
> [ 0.000000] [<ffffffff8272037a>] x86_64_start_kernel+0xe1/0xe8
> [ 0.000000] ------------[ cut here ]------------
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/