Changes since v1:
  - reduce the number of #ifdefs by introducing the
    max_possible_pfn global variable
  - don't check 'acpi_no_memhotplug=1' to disable
    SWIOTLB initialization, since the existing 'no_iommu'
    kernel option can be used to the same effect
  - split into 2 patches:
      - the 1st adds max_possible_pfn
      - the 2nd fixes the bug by enabling SWIOTLB when it's needed

When a system with memory hotplug enabled is booted with less
than 4GB of RAM and more RAM is hotplugged later, 32-bit
devices stop functioning with the following error:

 nommu_map_single: overflow 327b4f8c0+1522 of device mask ffffffff

The reason is that an x86_64 system booted with less than
4GB of RAM doesn't enable SWIOTLB, so when memory is
hotplugged beyond MAX_DMA32_PFN, devices that expect
32-bit addresses can't handle 64-bit addresses.

Fix it by tracking the max possible PFN when parsing
memory affinity structures from the SRAT ACPI table, and
enable SWIOTLB if there are hotpluggable memory
regions beyond MAX_DMA32_PFN.

This fixes KVM guests that use emulated devices
(reproduces with ata_piix, e1000 and usb devices;
RHBZ: 1275941, 1275977, 1271527).
It also fixes Hyper-V and VMware with emulated devices,
which are affected by this issue as well.
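
For readers following along, the decision logic described above can be
modelled in userspace roughly as follows. This is only a sketch, not kernel
code: struct pfn_state and all four function names are hypothetical, 4 KiB
pages are assumed, and max_pfn is simplified to "boot RAM size in pages"
(the real value comes from the E820 map and can include holes).

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned long long u64;

#define PAGE_SHIFT     12                           /* assumption: 4 KiB pages */
#define PAGE_SIZE      (1ULL << PAGE_SHIFT)
#define PFN_UP(x)      (((x) + PAGE_SIZE - 1) >> PAGE_SHIFT)
#define MAX_DMA32_PFN  (1ULL << (32 - PAGE_SHIFT))  /* PFN at the 4 GiB boundary */

/* Hypothetical container for the two values discussed in the series. */
struct pfn_state {
	u64 max_pfn;           /* highest PFN of RAM present at boot */
	u64 max_possible_pfn;  /* highest PFN that could appear after hotplug */
};

/* Models the setup_arch() hunk: start from what is present at boot. */
void pfn_state_init(struct pfn_state *s, u64 boot_ram_bytes)
{
	s->max_pfn = boot_ram_bytes >> PAGE_SHIFT;
	s->max_possible_pfn = s->max_pfn;
}

/* Models the srat.c hunk: 'end' is the exclusive end of a hotpluggable range. */
void pfn_state_note_hotplug_range(struct pfn_state *s, u64 end)
{
	assert(end > 0);  /* 'end - 1' would wrap around otherwise */
	if (s->max_possible_pfn < PFN_UP(end - 1))
		s->max_possible_pfn = PFN_UP(end - 1);
}

/* Check as it was before the series: only RAM present at boot counts. */
bool swiotlb_wanted_before(const struct pfn_state *s, bool no_iommu)
{
	return !no_iommu && s->max_pfn > MAX_DMA32_PFN;
}

/* Check as proposed: hotpluggable ranges above 4 GiB count as well. */
bool swiotlb_wanted_after(const struct pfn_state *s, bool no_iommu)
{
	return !no_iommu && s->max_possible_pfn > MAX_DMA32_PFN;
}
```

With 2GB present at boot and a hotpluggable range ending at 16GB, the old
check leaves SWIOTLB off while the new one turns it on, unless no_iommu is
set. The address in the error message above, 0x327b4f8c0, is itself beyond
the 4GB boundary, which is why it overflows the ffffffff device mask.
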
ref to v1:
https://lkml.org/lkml/2015/11/30/594
Igor Mammedov (2):
x86: introduce max_possible_pfn
x86_64: enable SWIOTLB if system has SRAT memory regions above
MAX_DMA32_PFN
arch/x86/kernel/pci-swiotlb.c | 2 +-
arch/x86/kernel/setup.c | 2 ++
arch/x86/mm/srat.c | 3 +++
include/linux/bootmem.h | 4 ++++
mm/bootmem.c | 1 +
mm/nobootmem.c | 1 +
6 files changed, 12 insertions(+), 1 deletion(-)
--
1.8.3.1
max_possible_pfn will be used for tracking the max possible
PFN for memory that isn't present in the E820 table and
could be hotplugged later.

By default max_possible_pfn is initialized with max_pfn,
but a follow-up patch will update it with the highest PFN of
hotpluggable memory ranges declared in the ACPI SRAT table,
if any are present.

Signed-off-by: Igor Mammedov <[email protected]>
---
arch/x86/kernel/setup.c | 2 ++
include/linux/bootmem.h | 4 ++++
mm/bootmem.c | 1 +
mm/nobootmem.c | 1 +
4 files changed, 8 insertions(+)
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 29db25f..16a8465 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1048,6 +1048,8 @@ void __init setup_arch(char **cmdline_p)
if (mtrr_trim_uncached_memory(max_pfn))
max_pfn = e820_end_of_ram_pfn();
+ max_possible_pfn = max_pfn;
+
#ifdef CONFIG_X86_32
/* max_low_pfn get updated here */
find_low_pfn_range();
diff --git a/include/linux/bootmem.h b/include/linux/bootmem.h
index f589222..5b39aa5 100644
--- a/include/linux/bootmem.h
+++ b/include/linux/bootmem.h
@@ -19,6 +19,10 @@ extern unsigned long min_low_pfn;
* highest page
*/
extern unsigned long max_pfn;
+/*
+ * highest possible page
+ */
+extern unsigned long max_possible_pfn;
#ifndef CONFIG_NO_BOOTMEM
/*
diff --git a/mm/bootmem.c b/mm/bootmem.c
index 3b63807..221a04b 100644
--- a/mm/bootmem.c
+++ b/mm/bootmem.c
@@ -33,6 +33,7 @@ EXPORT_SYMBOL(contig_page_data);
unsigned long max_low_pfn;
unsigned long min_low_pfn;
unsigned long max_pfn;
+unsigned long max_possible_pfn;
bootmem_data_t bootmem_node_data[MAX_NUMNODES] __initdata;
diff --git a/mm/nobootmem.c b/mm/nobootmem.c
index e57cf24..08614c8 100644
--- a/mm/nobootmem.c
+++ b/mm/nobootmem.c
@@ -31,6 +31,7 @@ EXPORT_SYMBOL(contig_page_data);
unsigned long max_low_pfn;
unsigned long min_low_pfn;
unsigned long max_pfn;
+unsigned long max_possible_pfn;
static void * __init __alloc_memory_core_early(int nid, u64 size, u64 align,
u64 goal, u64 limit)
--
1.8.3.1
When a system with memory hotplug enabled is booted with less
than 4GB of RAM and more RAM is hotplugged later, 32-bit
devices stop functioning with the following error:

 nommu_map_single: overflow 327b4f8c0+1522 of device mask ffffffff

The reason is that an x86_64 system booted with less than
4GB of RAM doesn't enable SWIOTLB, so when memory is
hotplugged beyond MAX_DMA32_PFN, devices that expect
32-bit addresses can't handle 64-bit addresses.

Fix it by tracking the max possible PFN when parsing
memory affinity structures from the SRAT ACPI table, and
enable SWIOTLB if there are hotpluggable memory
regions beyond MAX_DMA32_PFN.

This fixes KVM guests that use emulated devices
(reproduces with ata_piix, e1000 and usb devices;
RHBZ: 1275941, 1275977, 1271527).
It also fixes Hyper-V and VMware with emulated devices,
which are affected by this issue as well.

Signed-off-by: Igor Mammedov <[email protected]>
---
arch/x86/kernel/pci-swiotlb.c | 2 +-
arch/x86/mm/srat.c | 3 +++
2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
index adf0392..7c577a1 100644
--- a/arch/x86/kernel/pci-swiotlb.c
+++ b/arch/x86/kernel/pci-swiotlb.c
@@ -88,7 +88,7 @@ int __init pci_swiotlb_detect_4gb(void)
{
/* don't initialize swiotlb if iommu=off (no_iommu=1) */
#ifdef CONFIG_X86_64
- if (!no_iommu && max_pfn > MAX_DMA32_PFN)
+ if (!no_iommu && max_possible_pfn > MAX_DMA32_PFN)
swiotlb = 1;
#endif
return swiotlb;
diff --git a/arch/x86/mm/srat.c b/arch/x86/mm/srat.c
index c2aea63..a26bdbe 100644
--- a/arch/x86/mm/srat.c
+++ b/arch/x86/mm/srat.c
@@ -203,6 +203,9 @@ acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma)
pr_warn("SRAT: Failed to mark hotplug range [mem %#010Lx-%#010Lx] in memblock\n",
(unsigned long long)start, (unsigned long long)end - 1);
+ if (max_possible_pfn < PFN_UP(end - 1))
+ max_possible_pfn = PFN_UP(end - 1);
+
return 0;
out_err_bad_srat:
bad_srat();
--
1.8.3.1
* Igor Mammedov <[email protected]> wrote:
> when memory hotplug enabled system is booted with less
> than 4GB of RAM and then later more RAM is hotplugged
> 32-bit devices stop functioning with following error:
>
> nommu_map_single: overflow 327b4f8c0+1522 of device mask ffffffff
>
> the reason for this is that if x86_64 system were booted
> with RAM less than 4GB, it doesn't enable SWIOTLB and
> when memory is hotplugged beyond MAX_DMA32_PFN, devices
> that expect 32-bit addresses can't handle 64-bit addresses.
>
> Fix it by tracking max possible PFN when parsing
> memory affinity structures from SRAT ACPI table and
> enable SWIOTLB if there is hotpluggable memory
> regions beyond MAX_DMA32_PFN.
>
> It fixes KVM guests when they use emulated devices
> (reproduces with ata_piix, e1000 and usb devices,
> RHBZ: 1275941, 1275977, 1271527)
> It also fixes the HyperV, VMWare with emulated devices
> which are affected by this issue as well.
>
> Signed-off-by: Igor Mammedov <[email protected]>
> ---
> arch/x86/kernel/pci-swiotlb.c | 2 +-
> arch/x86/mm/srat.c | 3 +++
> 2 files changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
> index adf0392..7c577a1 100644
> --- a/arch/x86/kernel/pci-swiotlb.c
> +++ b/arch/x86/kernel/pci-swiotlb.c
> @@ -88,7 +88,7 @@ int __init pci_swiotlb_detect_4gb(void)
> {
> /* don't initialize swiotlb if iommu=off (no_iommu=1) */
> #ifdef CONFIG_X86_64
> - if (!no_iommu && max_pfn > MAX_DMA32_PFN)
> + if (!no_iommu && max_possible_pfn > MAX_DMA32_PFN)
> swiotlb = 1;

Ok, this series looks a lot better!

> #endif
> return swiotlb;
> diff --git a/arch/x86/mm/srat.c b/arch/x86/mm/srat.c
> index c2aea63..a26bdbe 100644
> --- a/arch/x86/mm/srat.c
> +++ b/arch/x86/mm/srat.c
> @@ -203,6 +203,9 @@ acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma)
> pr_warn("SRAT: Failed to mark hotplug range [mem %#010Lx-%#010Lx] in memblock\n",
> (unsigned long long)start, (unsigned long long)end - 1);
>
> + if (max_possible_pfn < PFN_UP(end - 1))
> + max_possible_pfn = PFN_UP(end - 1);

Small nit, why not write this as something like:

	max_possible_pfn = max(max_possible_pfn, PFN_UP(end - 1));

?

I'd also move this second hunk to the first patch, because logically it belongs
there (or into a third patch).

Thanks,

	Ingo
On Fri, 4 Dec 2015 12:49:49 +0100
Ingo Molnar <[email protected]> wrote:
>
> * Igor Mammedov <[email protected]> wrote:
>
> > when memory hotplug enabled system is booted with less
> > than 4GB of RAM and then later more RAM is hotplugged
> > 32-bit devices stop functioning with following error:
> >
> > nommu_map_single: overflow 327b4f8c0+1522 of device mask ffffffff
> >
> > the reason for this is that if x86_64 system were booted
> > with RAM less than 4GB, it doesn't enable SWIOTLB and
> > when memory is hotplugged beyond MAX_DMA32_PFN, devices
> > that expect 32-bit addresses can't handle 64-bit addresses.
> >
> > Fix it by tracking max possible PFN when parsing
> > memory affinity structures from SRAT ACPI table and
> > enable SWIOTLB if there is hotpluggable memory
> > regions beyond MAX_DMA32_PFN.
> >
> > It fixes KVM guests when they use emulated devices
> > (reproduces with ata_piix, e1000 and usb devices,
> > RHBZ: 1275941, 1275977, 1271527)
> > It also fixes the HyperV, VMWare with emulated devices
> > which are affected by this issue as well.
> >
> > Signed-off-by: Igor Mammedov <[email protected]>
> > ---
> > arch/x86/kernel/pci-swiotlb.c | 2 +-
> > arch/x86/mm/srat.c | 3 +++
> > 2 files changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
> > index adf0392..7c577a1 100644
> > --- a/arch/x86/kernel/pci-swiotlb.c
> > +++ b/arch/x86/kernel/pci-swiotlb.c
> > @@ -88,7 +88,7 @@ int __init pci_swiotlb_detect_4gb(void)
> > {
> > /* don't initialize swiotlb if iommu=off (no_iommu=1) */
> > #ifdef CONFIG_X86_64
> > - if (!no_iommu && max_pfn > MAX_DMA32_PFN)
> > + if (!no_iommu && max_possible_pfn > MAX_DMA32_PFN)
> > swiotlb = 1;
>
> Ok, this series looks a lot better!
>
> > #endif
> > return swiotlb;
> > diff --git a/arch/x86/mm/srat.c b/arch/x86/mm/srat.c
> > index c2aea63..a26bdbe 100644
> > --- a/arch/x86/mm/srat.c
> > +++ b/arch/x86/mm/srat.c
> > @@ -203,6 +203,9 @@ acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma)
> > pr_warn("SRAT: Failed to mark hotplug range [mem %#010Lx-%#010Lx] in memblock\n",
> > (unsigned long long)start, (unsigned long long)end - 1);
> >
> > + if (max_possible_pfn < PFN_UP(end - 1))
> > + max_possible_pfn = PFN_UP(end - 1);
>
> Small nit, why not write this as something like:
>
> max_possible_pfn = max(max_possible_pfn, PFN_UP(end - 1));
>
> ?
>
> I'd also move this second hunk to the first patch, because logically it belongs
> there (or into a third patch).

Sure, I'll move it to the 1st patch.

>
> Thanks,
>
> Ingo
On Fri, 4 Dec 2015 12:49:49 +0100
Ingo Molnar <[email protected]> wrote:
>
> * Igor Mammedov <[email protected]> wrote:
>
> > when memory hotplug enabled system is booted with less
> > than 4GB of RAM and then later more RAM is hotplugged
> > 32-bit devices stop functioning with following error:
> >
> > nommu_map_single: overflow 327b4f8c0+1522 of device mask ffffffff
> >
> > the reason for this is that if x86_64 system were booted
> > with RAM less than 4GB, it doesn't enable SWIOTLB and
> > when memory is hotplugged beyond MAX_DMA32_PFN, devices
> > that expect 32-bit addresses can't handle 64-bit addresses.
> >
> > Fix it by tracking max possible PFN when parsing
> > memory affinity structures from SRAT ACPI table and
> > enable SWIOTLB if there is hotpluggable memory
> > regions beyond MAX_DMA32_PFN.
> >
> > It fixes KVM guests when they use emulated devices
> > (reproduces with ata_piix, e1000 and usb devices,
> > RHBZ: 1275941, 1275977, 1271527)
> > It also fixes the HyperV, VMWare with emulated devices
> > which are affected by this issue as well.
> >
> > Signed-off-by: Igor Mammedov <[email protected]>
> > ---
> > arch/x86/kernel/pci-swiotlb.c | 2 +-
> > arch/x86/mm/srat.c | 3 +++
> > 2 files changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
> > index adf0392..7c577a1 100644
> > --- a/arch/x86/kernel/pci-swiotlb.c
> > +++ b/arch/x86/kernel/pci-swiotlb.c
> > @@ -88,7 +88,7 @@ int __init pci_swiotlb_detect_4gb(void)
> > {
> > /* don't initialize swiotlb if iommu=off (no_iommu=1) */
> > #ifdef CONFIG_X86_64
> > - if (!no_iommu && max_pfn > MAX_DMA32_PFN)
> > + if (!no_iommu && max_possible_pfn > MAX_DMA32_PFN)
> > swiotlb = 1;
>
> Ok, this series looks a lot better!
>
> > #endif
> > return swiotlb;
> > diff --git a/arch/x86/mm/srat.c b/arch/x86/mm/srat.c
> > index c2aea63..a26bdbe 100644
> > --- a/arch/x86/mm/srat.c
> > +++ b/arch/x86/mm/srat.c
> > @@ -203,6 +203,9 @@ acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma)
> > pr_warn("SRAT: Failed to mark hotplug range [mem %#010Lx-%#010Lx] in memblock\n",
> > (unsigned long long)start, (unsigned long long)end - 1);
> >
> > + if (max_possible_pfn < PFN_UP(end - 1))
> > + max_possible_pfn = PFN_UP(end - 1);
>
> Small nit, why not write this as something like:
>
> max_possible_pfn = max(max_possible_pfn, PFN_UP(end - 1));

That doesn't pass the strict type check:

include/linux/kernel.h:730:17: warning: comparison of distinct pointer types lacks a cast [enabled by default]
(void) (&_max1 == &_max2); \
^
arch/x86/mm/srat.c:206:21: note: in expansion of macro ‘max’
max_possible_pfn = max(max_possible_pfn, PFN_UP(end - 1));

I can change max_possible_pfn to u64 to match the type of PFN_UP(end - 1).
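
To illustrate why matching the types helps, here is a small userspace
sketch. It assumes u64 is 'unsigned long long' and 4 KiB pages; max_strict()
is a hypothetical local copy of the type-checked macro quoted in the warning
above (it relies on GNU C statement expressions and __typeof__), and
note_range_end() is a made-up helper name. With max_possible_pfn declared as
'unsigned long', the two temporaries would have distinct types and the
pointer comparison would trigger the warning; with u64 on both sides it
builds cleanly.

```c
#include <assert.h>

typedef unsigned long long u64;   /* assumption: same underlying type as the kernel's u64 */

#define PAGE_SHIFT  12            /* assumption: 4 KiB pages */
#define PAGE_SIZE   (1UL << PAGE_SHIFT)
#define PFN_UP(x)   (((x) + PAGE_SIZE - 1) >> PAGE_SHIFT)

/*
 * Hypothetical userspace copy of a type-checked max(): comparing the
 * addresses of the two temporaries makes the compiler complain when
 * their types differ.
 */
#define max_strict(x, y) ({             \
	__typeof__(x) _max1 = (x);      \
	__typeof__(y) _max2 = (y);      \
	(void) (&_max1 == &_max2);      \
	_max1 > _max2 ? _max1 : _max2; })

/* Declared as u64 so that it matches the type of PFN_UP(end - 1). */
u64 max_possible_pfn;

/* 'end' is the exclusive end address of a hotpluggable range. */
void note_range_end(u64 end)
{
	assert(end > 0);  /* 'end - 1' would wrap around otherwise */
	max_possible_pfn = max_strict(max_possible_pfn, PFN_UP(end - 1));
}
```

Calling note_range_end() repeatedly only ever raises max_possible_pfn, which
is the same behaviour as the open-coded 'if' in the patch.
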
>
> ?
>
> I'd also move this second hunk to the first patch, because logically it belongs
> there (or into a third patch).
>
> Thanks,
>
> Ingo
On 04/12/2015 13:33, Igor Mammedov wrote:
> That doesn't pass strict type check:
>
> include/linux/kernel.h:730:17: warning: comparison of distinct pointer types lacks a cast [enabled by default]
> (void) (&_max1 == &_max2); \
> ^
> arch/x86/mm/srat.c:206:21: note: in expansion of macro ‘max’
> max_possible_pfn = max(max_possible_pfn, PFN_UP(end - 1));
>
> I can change max_possible_pfn to u64 to match PFN_UP(end - 1) type.

Sounds like a good idea anyway.

Paolo
* Paolo Bonzini <[email protected]> wrote:
> On 04/12/2015 13:33, Igor Mammedov wrote:
> > That doesn't pass strict type check:
> >
> > include/linux/kernel.h:730:17: warning: comparison of distinct pointer types lacks a cast [enabled by default]
> > (void) (&_max1 == &_max2); \
> > ^
> > arch/x86/mm/srat.c:206:21: note: in expansion of macro ‘max’
> > max_possible_pfn = max(max_possible_pfn, PFN_UP(end - 1));
> >
> > I can change max_possible_pfn to u64 to match PFN_UP(end - 1) type.
>
> Sounds like a good idea anyway.

Yeah, agreed.

Thanks,

	Ingo