2015-12-04 13:07:14

by Igor Mammedov

Subject: [PATCH v3 0/2] x86: enable SWIOTLB if system has SRAT memory regions above MAX_DMA32_PFN

changes since v2:
 - moved 'max_possible_pfn' tracking hunk to the 1st patch
 - make 'max_possible_pfn' 64-bit
 - simplify condition to oneliner as suggested by Ingo
changes since v1:
 - reduce # of #ifdefs by introducing max_possible_pfn
   global variable
 - don't check 'acpi_no_memhotplug=1' for disabling
   SWIOTLB initialization, since the existing 'no_iommu'
   kernel option could be used to the same effect.
 - split into 2 patches
    - 1st adds max_possible_pfn,
    - 2nd fixes the bug by enabling SWIOTLB when it's needed

When a system with memory hotplug enabled is booted with
less than 4GB of RAM and more RAM is hotplugged later,
32-bit devices stop functioning with the following error:

nommu_map_single: overflow 327b4f8c0+1522 of device mask ffffffff

The reason is that an x86_64 system booted with less than
4GB of RAM doesn't enable SWIOTLB, so when memory is
hotplugged beyond MAX_DMA32_PFN, devices that expect
32-bit addresses can't handle the 64-bit addresses.
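
For illustration only (not part of the patch): the error above
reports a bus address, a mapping length and the device mask.
Below is a minimal user-space sketch of that kind of range check,
assuming the failure condition is simply "the last byte of the
buffer lies above the mask". fits_dma_mask() and
check_overflow_example() are hypothetical names, not kernel
functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a DMA capability check: a buffer is
 * reachable only if its last byte still fits under the device's
 * DMA mask. Not the kernel implementation.
 */
static bool fits_dma_mask(uint64_t addr, uint64_t size, uint64_t dma_mask)
{
	if (size == 0)
		return false;
	return addr + size - 1 <= dma_mask;
}

/* Self-check using the values from the error message. */
void check_overflow_example(void)
{
	const uint64_t mask32 = 0xffffffffULL;

	/* 0x327b4f8c0 lies above 4GB, so a 32-bit mask rejects it. */
	assert(!fits_dma_mask(0x327b4f8c0ULL, 1522, mask32));
	/* A buffer of the same length below 4GB is accepted. */
	assert(fits_dma_mask(0x27b4f8c0ULL, 1522, mask32));
}
```

Without SWIOTLB there is no bounce buffer below 4GB to fall back
on, so a mapping that fails this check simply fails.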

Fix it by tracking the max possible PFN when parsing
memory affinity structures from the SRAT ACPI table, and
enable SWIOTLB if there are hotpluggable memory
regions beyond MAX_DMA32_PFN.
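
The approach described above can be sketched in user-space C as
follows. This is an illustration under stated assumptions, not
the kernel code: 4KB pages are assumed, struct mem_range,
track_max_possible_pfn() and need_swiotlb() are hypothetical
names, and PFN_UP / MAX_DMA32_PFN are redefined locally to mirror
the kernel macros of the same name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT	12			/* assumption: 4KB pages */
#define PAGE_SIZE	(1ULL << PAGE_SHIFT)
/* Round a byte address up to a page frame number. */
#define PFN_UP(x)	(((x) + PAGE_SIZE - 1) >> PAGE_SHIFT)
/* First page frame number at 4GB. */
#define MAX_DMA32_PFN	((4ULL << 30) >> PAGE_SHIFT)

/* Hypothetical description of one SRAT memory range: [start, end) in bytes. */
struct mem_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Start from the highest PFN present at boot and raise it to cover
 * every declared range, including memory that is not plugged in yet.
 */
static uint64_t track_max_possible_pfn(uint64_t max_pfn,
				       const struct mem_range *ranges, int n)
{
	uint64_t max_possible_pfn = max_pfn;
	int i;

	for (i = 0; i < n; i++) {
		uint64_t pfn;

		assert(ranges[i].end > ranges[i].start);
		pfn = PFN_UP(ranges[i].end - 1);
		if (pfn > max_possible_pfn)
			max_possible_pfn = pfn;
	}
	return max_possible_pfn;
}

/* Decide on SWIOTLB from possible memory, not just present memory. */
static bool need_swiotlb(bool no_iommu, uint64_t max_possible_pfn)
{
	return !no_iommu && max_possible_pfn > MAX_DMA32_PFN;
}
```

For example, a guest booted with 2GB of RAM (max_pfn 0x80000)
and one hotpluggable range from 4GB to 8GB ends up with a max
possible PFN of 0x200000, which is above MAX_DMA32_PFN
(0x100000), so the sketch reports that SWIOTLB is needed even
though no memory above 4GB is present at boot.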

It fixes KVM guests when they use emulated devices
(reproduces with ata_piix, e1000 and usb devices,
RHBZ: 1275941, 1275977, 1271527).
It also fixes Hyper-V and VMware guests with emulated
devices, which are affected by this issue as well.

ref to v2:
https://lkml.org/lkml/2015/12/4/151
ref to v1:
https://lkml.org/lkml/2015/11/30/594

Igor Mammedov (2):
x86: introduce max_possible_pfn
x86_64: enable SWIOTLB if system has SRAT memory regions above
MAX_DMA32_PFN

arch/x86/kernel/pci-swiotlb.c | 2 +-
arch/x86/kernel/setup.c | 2 ++
arch/x86/mm/srat.c | 2 ++
include/linux/bootmem.h | 4 ++++
mm/bootmem.c | 1 +
mm/nobootmem.c | 1 +
6 files changed, 11 insertions(+), 1 deletion(-)

--
1.8.3.1


2015-12-04 13:07:17

by Igor Mammedov

Subject: [PATCH v3 1/2] x86: introduce max_possible_pfn

max_possible_pfn will be used for tracking the max possible
PFN for memory that isn't present in the E820 table and
could be hotplugged later.

By default max_possible_pfn is initialized with max_pfn,
but later it could be updated with the highest PFN of the
hotpluggable memory ranges declared in the ACPI SRAT table,
if any are present.

Signed-off-by: Igor Mammedov <[email protected]>
---
v3:
- make 'max_possible_pfn' 64-bit
- simplify condition to oneliner as suggested by Ingo
---
arch/x86/kernel/setup.c | 2 ++
arch/x86/mm/srat.c | 2 ++
include/linux/bootmem.h | 4 ++++
mm/bootmem.c | 1 +
mm/nobootmem.c | 1 +
5 files changed, 10 insertions(+)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 29db25f..16a8465 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1048,6 +1048,8 @@ void __init setup_arch(char **cmdline_p)
 	if (mtrr_trim_uncached_memory(max_pfn))
 		max_pfn = e820_end_of_ram_pfn();
 
+	max_possible_pfn = max_pfn;
+
 #ifdef CONFIG_X86_32
 	/* max_low_pfn get updated here */
 	find_low_pfn_range();
diff --git a/arch/x86/mm/srat.c b/arch/x86/mm/srat.c
index c2aea63..b5f8218 100644
--- a/arch/x86/mm/srat.c
+++ b/arch/x86/mm/srat.c
@@ -203,6 +203,8 @@ acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma)
 		pr_warn("SRAT: Failed to mark hotplug range [mem %#010Lx-%#010Lx] in memblock\n",
 			(unsigned long long)start, (unsigned long long)end - 1);
 
+	max_possible_pfn = max(max_possible_pfn, PFN_UP(end - 1));
+
 	return 0;
 out_err_bad_srat:
 	bad_srat();
diff --git a/include/linux/bootmem.h b/include/linux/bootmem.h
index f589222..35b22f9 100644
--- a/include/linux/bootmem.h
+++ b/include/linux/bootmem.h
@@ -19,6 +19,10 @@ extern unsigned long min_low_pfn;
* highest page
*/
extern unsigned long max_pfn;
+/*
+ * highest possible page
+ */
+extern unsigned long long max_possible_pfn;

#ifndef CONFIG_NO_BOOTMEM
/*
diff --git a/mm/bootmem.c b/mm/bootmem.c
index 3b63807..91e32bc 100644
--- a/mm/bootmem.c
+++ b/mm/bootmem.c
@@ -33,6 +33,7 @@ EXPORT_SYMBOL(contig_page_data);
unsigned long max_low_pfn;
unsigned long min_low_pfn;
unsigned long max_pfn;
+unsigned long long max_possible_pfn;

bootmem_data_t bootmem_node_data[MAX_NUMNODES] __initdata;

diff --git a/mm/nobootmem.c b/mm/nobootmem.c
index e57cf24..99feb2b 100644
--- a/mm/nobootmem.c
+++ b/mm/nobootmem.c
@@ -31,6 +31,7 @@ EXPORT_SYMBOL(contig_page_data);
unsigned long max_low_pfn;
unsigned long min_low_pfn;
unsigned long max_pfn;
+unsigned long long max_possible_pfn;

static void * __init __alloc_memory_core_early(int nid, u64 size, u64 align,
u64 goal, u64 limit)
--
1.8.3.1

2015-12-04 13:07:22

by Igor Mammedov

Subject: [PATCH v3 2/2] x86_64: enable SWIOTLB if system has SRAT memory regions above MAX_DMA32_PFN

When a system with memory hotplug enabled is booted with
less than 4GB of RAM and more RAM is hotplugged later,
32-bit devices stop functioning with the following error:

nommu_map_single: overflow 327b4f8c0+1522 of device mask ffffffff

The reason is that an x86_64 system booted with less than
4GB of RAM doesn't enable SWIOTLB, so when memory is
hotplugged beyond MAX_DMA32_PFN, devices that expect
32-bit addresses can't handle the 64-bit addresses.

Fix it by tracking the max possible PFN when parsing
memory affinity structures from the SRAT ACPI table, and
enable SWIOTLB if there are hotpluggable memory
regions beyond MAX_DMA32_PFN.

It fixes KVM guests when they use emulated devices
(reproduces with ata_piix, e1000 and usb devices,
RHBZ: 1275941, 1275977, 1271527).
It also fixes Hyper-V and VMware guests with emulated
devices, which are affected by this issue as well.

Signed-off-by: Igor Mammedov <[email protected]>
---
v3:
 - moved 'max_possible_pfn' tracking hunk to the patch
   that introduces it.
---
arch/x86/kernel/pci-swiotlb.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
index adf0392..7c577a1 100644
--- a/arch/x86/kernel/pci-swiotlb.c
+++ b/arch/x86/kernel/pci-swiotlb.c
@@ -88,7 +88,7 @@ int __init pci_swiotlb_detect_4gb(void)
 {
 	/* don't initialize swiotlb if iommu=off (no_iommu=1) */
 #ifdef CONFIG_X86_64
-	if (!no_iommu && max_pfn > MAX_DMA32_PFN)
+	if (!no_iommu && max_possible_pfn > MAX_DMA32_PFN)
 		swiotlb = 1;
 #endif
 	return swiotlb;
--
1.8.3.1

Subject: [tip:x86/mm] x86/mm: Introduce max_possible_pfn

Commit-ID: 8dd3303001976aa8583bf20f6b93590c74114308
Gitweb: http://git.kernel.org/tip/8dd3303001976aa8583bf20f6b93590c74114308
Author: Igor Mammedov <[email protected]>
AuthorDate: Fri, 4 Dec 2015 14:07:05 +0100
Committer: Ingo Molnar <[email protected]>
CommitDate: Sun, 6 Dec 2015 12:46:31 +0100

x86/mm: Introduce max_possible_pfn

max_possible_pfn will be used for tracking the max possible
PFN for memory that isn't present in the E820 table and
could be hotplugged later.

By default max_possible_pfn is initialized with max_pfn,
but later it could be updated with the highest PFN of the
hotpluggable memory ranges declared in the ACPI SRAT table,
if any are present.

Signed-off-by: Igor Mammedov <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/kernel/setup.c | 2 ++
arch/x86/mm/srat.c | 2 ++
include/linux/bootmem.h | 4 ++++
mm/bootmem.c | 1 +
mm/nobootmem.c | 1 +
5 files changed, 10 insertions(+)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 29db25f..16a8465 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1048,6 +1048,8 @@ void __init setup_arch(char **cmdline_p)
 	if (mtrr_trim_uncached_memory(max_pfn))
 		max_pfn = e820_end_of_ram_pfn();
 
+	max_possible_pfn = max_pfn;
+
 #ifdef CONFIG_X86_32
 	/* max_low_pfn get updated here */
 	find_low_pfn_range();
diff --git a/arch/x86/mm/srat.c b/arch/x86/mm/srat.c
index c2aea63..b5f8218 100644
--- a/arch/x86/mm/srat.c
+++ b/arch/x86/mm/srat.c
@@ -203,6 +203,8 @@ acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma)
 		pr_warn("SRAT: Failed to mark hotplug range [mem %#010Lx-%#010Lx] in memblock\n",
 			(unsigned long long)start, (unsigned long long)end - 1);
 
+	max_possible_pfn = max(max_possible_pfn, PFN_UP(end - 1));
+
 	return 0;
 out_err_bad_srat:
 	bad_srat();
diff --git a/include/linux/bootmem.h b/include/linux/bootmem.h
index f589222..35b22f9 100644
--- a/include/linux/bootmem.h
+++ b/include/linux/bootmem.h
@@ -19,6 +19,10 @@ extern unsigned long min_low_pfn;
* highest page
*/
extern unsigned long max_pfn;
+/*
+ * highest possible page
+ */
+extern unsigned long long max_possible_pfn;

#ifndef CONFIG_NO_BOOTMEM
/*
diff --git a/mm/bootmem.c b/mm/bootmem.c
index 3b63807..91e32bc 100644
--- a/mm/bootmem.c
+++ b/mm/bootmem.c
@@ -33,6 +33,7 @@ EXPORT_SYMBOL(contig_page_data);
unsigned long max_low_pfn;
unsigned long min_low_pfn;
unsigned long max_pfn;
+unsigned long long max_possible_pfn;

bootmem_data_t bootmem_node_data[MAX_NUMNODES] __initdata;

diff --git a/mm/nobootmem.c b/mm/nobootmem.c
index e57cf24..99feb2b 100644
--- a/mm/nobootmem.c
+++ b/mm/nobootmem.c
@@ -31,6 +31,7 @@ EXPORT_SYMBOL(contig_page_data);
unsigned long max_low_pfn;
unsigned long min_low_pfn;
unsigned long max_pfn;
+unsigned long long max_possible_pfn;

static void * __init __alloc_memory_core_early(int nid, u64 size, u64 align,
u64 goal, u64 limit)

Subject: [tip:x86/mm] x86/mm/64: Enable SWIOTLB if system has SRAT memory regions above MAX_DMA32_PFN

Commit-ID: ec941c5ffede4d788b9fc008f9eeca75b9e964f5
Gitweb: http://git.kernel.org/tip/ec941c5ffede4d788b9fc008f9eeca75b9e964f5
Author: Igor Mammedov <[email protected]>
AuthorDate: Fri, 4 Dec 2015 14:07:06 +0100
Committer: Ingo Molnar <[email protected]>
CommitDate: Sun, 6 Dec 2015 12:46:31 +0100

x86/mm/64: Enable SWIOTLB if system has SRAT memory regions above MAX_DMA32_PFN

When a system with memory hotplug enabled is booted with
less than 4GB of RAM and more RAM is hotplugged later,
32-bit devices stop functioning with the following error:

nommu_map_single: overflow 327b4f8c0+1522 of device mask ffffffff

The reason is that an x86_64 system booted with less than
4GB of RAM doesn't enable SWIOTLB, so when memory is
hotplugged beyond MAX_DMA32_PFN, devices that expect
32-bit addresses can't handle the 64-bit addresses.

Fix it by tracking the max possible PFN when parsing
memory affinity structures from the SRAT ACPI table, and
enable SWIOTLB if there are hotpluggable memory
regions beyond MAX_DMA32_PFN.

It fixes KVM guests when they use emulated devices
(reproduces with ata_piix, e1000 and usb devices,
RHBZ: 1275941, 1275977, 1271527).

It also fixes Hyper-V and VMware guests with emulated
devices, which are affected by this issue as well.

Signed-off-by: Igor Mammedov <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/kernel/pci-swiotlb.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
index adf0392..7c577a1 100644
--- a/arch/x86/kernel/pci-swiotlb.c
+++ b/arch/x86/kernel/pci-swiotlb.c
@@ -88,7 +88,7 @@ int __init pci_swiotlb_detect_4gb(void)
 {
 	/* don't initialize swiotlb if iommu=off (no_iommu=1) */
 #ifdef CONFIG_X86_64
-	if (!no_iommu && max_pfn > MAX_DMA32_PFN)
+	if (!no_iommu && max_possible_pfn > MAX_DMA32_PFN)
 		swiotlb = 1;
 #endif
 	return swiotlb;