2012-11-16 21:53:50

by Duyck, Alexander H

Subject: [PATCH v4 0/8] Improve performance of VM translation on x86_64

This patch series is meant to address several issues I encountered with VM
translations on x86_64. In my testing I found that swiotlb was incurring up
to a 5% processing overhead due to calls to __phys_addr. To address that I
updated swiotlb to use physical addresses instead of virtual addresses,
reducing the need to call __phys_addr. However, those patches didn't address
the other callers. With these patches applied I am able to achieve an
additional 1% to 2% performance gain on top of the changes to swiotlb.

The first 2 patches are the performance optimizations that result in the 1% to
2% increase in overall performance. The remaining patches are various
cleanups for a number of spots where __pa or virt_to_phys was being called
but was not needed, or where __pa_symbol could have been used.

It doesn't seem like the earlier patch sets were accepted, so I am submitting
an updated set that is rebased off of linux-next with a few additional
improvements to the existing patches. Specifically, the first patch now also
updates __virt_addr_valid so that it is almost identical in layout to
__phys_addr. I also found one additional spot in init_64.c that could use
__pa_symbol instead of virt_to_page calls, so I updated the __pa_symbol
patch that covers the x86 init calls.

With this patch set applied I am seeing a 1-2% improvement in performance in
my routing tests. Without my earlier swiotlb changes applied the gain was as
high as 6-7%, because that code originally relied heavily on virt_to_phys.

The overall effect on size varies depending on what kernel options are
enabled. I have noticed that almost all of the network device drivers have
dropped in size by around 100 bytes, I suspect because the virt_to_page call
in dma_map_single is now less expensive. However, the default x86_64 build
increases the vmlinux size by 3.5K with this change applied.

v2: Rebased changes onto linux-next due to changes in x86/xen tree.
v3: Changes to __virt_addr_valid so it was in sync with __phys_addr.
Changes to init_64.c function mark_rodata_ro to avoid virt_to_page calls.
v4: Spun x86/xen changes off as a separate patch.
Added new patch to push address translation into page_64.h.
Minor change to __phys_addr_symbol to avoid an unnecessary second check (the y > x comparison).
---

Alexander Duyck (8):
x86: Move some contents of page_64_types.h into pgtable_64.h and page_64.h
x86: Improve __phys_addr performance by making use of carry flags and inlining
x86: Make it so that __pa_symbol can only process kernel symbols on x86_64
x86: Drop 4 unnecessary calls to __pa_symbol
x86: Use __pa_symbol instead of __pa on C visible symbols
x86/ftrace: Use __pa_symbol instead of __pa on C visible symbols
x86/acpi: Use __pa_symbol instead of __pa on C visible symbols
x86/lguest: Use __pa_symbol instead of __pa on C visible symbols


arch/x86/include/asm/page.h | 3 +-
arch/x86/include/asm/page_32.h | 1 +
arch/x86/include/asm/page_64.h | 36 ++++++++++++++++++++++++
arch/x86/include/asm/page_64_types.h | 22 ---------------
arch/x86/include/asm/pgtable_64.h | 5 +++
arch/x86/kernel/acpi/sleep.c | 2 +
arch/x86/kernel/cpu/intel.c | 2 +
arch/x86/kernel/ftrace.c | 4 +--
arch/x86/kernel/head32.c | 4 +--
arch/x86/kernel/head64.c | 4 +--
arch/x86/kernel/setup.c | 16 +++++------
arch/x86/kernel/x8664_ksyms_64.c | 3 ++
arch/x86/lguest/boot.c | 3 +-
arch/x86/mm/init_64.c | 18 +++++-------
arch/x86/mm/pageattr.c | 8 +++--
arch/x86/mm/physaddr.c | 51 ++++++++++++++++++++++++----------
arch/x86/platform/efi/efi.c | 4 +--
arch/x86/realmode/init.c | 8 +++--
18 files changed, 119 insertions(+), 75 deletions(-)



2012-11-16 21:54:21

by Duyck, Alexander H

Subject: [PATCH v4 1/8] x86: Move some contents of page_64_types.h into pgtable_64.h and page_64.h

This patch is meant to clean up the fact that we have several functions in
page_64_types.h which really don't belong there. I found this issue when I
tried to replace __phys_addr with an inline function; that resulted in the
realmode bits generating compile warnings about types. To resolve that, I am
relocating the address translation to page_64.h, since this is in keeping
with where these functions are located on 32-bit.

In addition, I have relocated the declarations for several functions defined
in init_64.c to pgtable_64.h, as this seems to be where most of the functions
related to memory initialization were already declared.

Signed-off-by: Alexander Duyck <[email protected]>
---
arch/x86/include/asm/page_64.h | 19 +++++++++++++++++++
arch/x86/include/asm/page_64_types.h | 22 ----------------------
arch/x86/include/asm/pgtable_64.h | 5 +++++
3 files changed, 24 insertions(+), 22 deletions(-)

diff --git a/arch/x86/include/asm/page_64.h b/arch/x86/include/asm/page_64.h
index 072694e..4150999 100644
--- a/arch/x86/include/asm/page_64.h
+++ b/arch/x86/include/asm/page_64.h
@@ -3,4 +3,23 @@

#include <asm/page_64_types.h>

+#ifndef __ASSEMBLY__
+
+/* duplicated to the one in bootmem.h */
+extern unsigned long max_pfn;
+extern unsigned long phys_base;
+
+extern unsigned long __phys_addr(unsigned long);
+
+#define __phys_reloc_hide(x) (x)
+
+#ifdef CONFIG_FLATMEM
+#define pfn_valid(pfn) ((pfn) < max_pfn)
+#endif
+
+void clear_page(void *page);
+void copy_page(void *to, void *from);
+
+#endif /* !__ASSEMBLY__ */
+
#endif /* _ASM_X86_PAGE_64_H */
diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
index 320f7bb..8b491e6 100644
--- a/arch/x86/include/asm/page_64_types.h
+++ b/arch/x86/include/asm/page_64_types.h
@@ -50,26 +50,4 @@
#define KERNEL_IMAGE_SIZE (512 * 1024 * 1024)
#define KERNEL_IMAGE_START _AC(0xffffffff80000000, UL)

-#ifndef __ASSEMBLY__
-void clear_page(void *page);
-void copy_page(void *to, void *from);
-
-/* duplicated to the one in bootmem.h */
-extern unsigned long max_pfn;
-extern unsigned long phys_base;
-
-extern unsigned long __phys_addr(unsigned long);
-#define __phys_reloc_hide(x) (x)
-
-#define vmemmap ((struct page *)VMEMMAP_START)
-
-extern void init_extra_mapping_uc(unsigned long phys, unsigned long size);
-extern void init_extra_mapping_wb(unsigned long phys, unsigned long size);
-
-#endif /* !__ASSEMBLY__ */
-
-#ifdef CONFIG_FLATMEM
-#define pfn_valid(pfn) ((pfn) < max_pfn)
-#endif
-
#endif /* _ASM_X86_PAGE_64_DEFS_H */
diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h
index 47356f9..b5d30ad 100644
--- a/arch/x86/include/asm/pgtable_64.h
+++ b/arch/x86/include/asm/pgtable_64.h
@@ -183,6 +183,11 @@ extern void cleanup_highmap(void);

#define __HAVE_ARCH_PTE_SAME

+#define vmemmap ((struct page *)VMEMMAP_START)
+
+extern void init_extra_mapping_uc(unsigned long phys, unsigned long size);
+extern void init_extra_mapping_wb(unsigned long phys, unsigned long size);
+
#endif /* !__ASSEMBLY__ */

#endif /* _ASM_X86_PGTABLE_64_H */

2012-11-16 21:55:04

by Duyck, Alexander H

[permalink] [raw]
Subject: [PATCH v4 2/8] x86: Improve __phys_addr performance by making use of carry flags and inlining

This patch is meant to improve overall system performance when making use of
the __phys_addr call. To do this I have implemented several changes.

First, if CONFIG_DEBUG_VIRTUAL is not defined, __phys_addr is made an inline,
similar to how this is currently handled on 32-bit. However, in order to do
this it is necessary to export phys_base so that it is available when
__phys_addr is used in kernel modules.

The second change was to streamline the code by making use of the carry flag
on an add operation instead of performing a compare on a 64 bit value. The
advantage to this is that it allows us to significantly reduce the overall
size of the call. On my Xeon E5 system the entire __phys_addr inline call
consumes a little less than 32 bytes and 5 instructions. I also applied
similar logic to the debug version of the function. My testing shows that the
debug version of the function with this patch applied is slightly faster than
the non-debug version without the patch.

Finally, I also applied the same logic changes to __virt_addr_valid, since it
used the same general code flow as __phys_addr and could achieve similar gains
through these changes.

Signed-off-by: Alexander Duyck <[email protected]>
---
v3: Added changes to __virt_addr_valid to keep it in sync with __phys_addr
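
As a worked illustration of the wraparound test (a sketch assuming the
2012-era constants __START_KERNEL_map = 0xffffffff80000000 and
PAGE_OFFSET = 0xffff880000000000; not part of the patch):

    x = 0xffffffff81234567 (kernel text address)
    y = x - __START_KERNEL_map = 0x0000000001234567 -> no wrap, so x > y
    x = 0xffff880012345678 (direct-mapping address)
    y = x - __START_KERNEL_map = 0xffff880092345678 -> wrapped, so x < y

The single "x > y" compare therefore selects between adding phys_base and
adding (__START_KERNEL_map - PAGE_OFFSET), and the compiler can implement it
as a subtract followed by a carry-flag test.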

arch/x86/include/asm/page_64.h | 14 +++++++++++++
arch/x86/kernel/x8664_ksyms_64.c | 3 +++
arch/x86/mm/physaddr.c | 40 ++++++++++++++++++++++++--------------
3 files changed, 42 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/page_64.h b/arch/x86/include/asm/page_64.h
index 4150999..5138174 100644
--- a/arch/x86/include/asm/page_64.h
+++ b/arch/x86/include/asm/page_64.h
@@ -9,7 +9,21 @@
extern unsigned long max_pfn;
extern unsigned long phys_base;

+static inline unsigned long __phys_addr_nodebug(unsigned long x)
+{
+ unsigned long y = x - __START_KERNEL_map;
+
+ /* use the carry flag to determine if x was < __START_KERNEL_map */
+ x = y + ((x > y) ? phys_base : (__START_KERNEL_map - PAGE_OFFSET));
+
+ return x;
+}
+
+#ifdef CONFIG_DEBUG_VIRTUAL
extern unsigned long __phys_addr(unsigned long);
+#else
+#define __phys_addr(x) __phys_addr_nodebug(x)
+#endif

#define __phys_reloc_hide(x) (x)

diff --git a/arch/x86/kernel/x8664_ksyms_64.c b/arch/x86/kernel/x8664_ksyms_64.c
index 1330dd1..b014d94 100644
--- a/arch/x86/kernel/x8664_ksyms_64.c
+++ b/arch/x86/kernel/x8664_ksyms_64.c
@@ -59,6 +59,9 @@ EXPORT_SYMBOL(memcpy);
EXPORT_SYMBOL(__memcpy);
EXPORT_SYMBOL(memmove);

+#ifndef CONFIG_DEBUG_VIRTUAL
+EXPORT_SYMBOL(phys_base);
+#endif
EXPORT_SYMBOL(empty_zero_page);
#ifndef CONFIG_PARAVIRT
EXPORT_SYMBOL(native_load_gs_index);
diff --git a/arch/x86/mm/physaddr.c b/arch/x86/mm/physaddr.c
index d2e2735..fd40d75 100644
--- a/arch/x86/mm/physaddr.c
+++ b/arch/x86/mm/physaddr.c
@@ -8,33 +8,43 @@

#ifdef CONFIG_X86_64

+#ifdef CONFIG_DEBUG_VIRTUAL
unsigned long __phys_addr(unsigned long x)
{
- if (x >= __START_KERNEL_map) {
- x -= __START_KERNEL_map;
- VIRTUAL_BUG_ON(x >= KERNEL_IMAGE_SIZE);
- x += phys_base;
+ unsigned long y = x - __START_KERNEL_map;
+
+ /* use the carry flag to determine if x was < __START_KERNEL_map */
+ if (unlikely(x > y)) {
+ x = y + phys_base;
+
+ VIRTUAL_BUG_ON(y >= KERNEL_IMAGE_SIZE);
} else {
- VIRTUAL_BUG_ON(x < PAGE_OFFSET);
- x -= PAGE_OFFSET;
- VIRTUAL_BUG_ON(!phys_addr_valid(x));
+ x = y + (__START_KERNEL_map - PAGE_OFFSET);
+
+ /* carry flag will be set if starting x was >= PAGE_OFFSET */
+ VIRTUAL_BUG_ON((x > y) || !phys_addr_valid(x));
}
+
return x;
}
EXPORT_SYMBOL(__phys_addr);
+#endif

bool __virt_addr_valid(unsigned long x)
{
- if (x >= __START_KERNEL_map) {
- x -= __START_KERNEL_map;
- if (x >= KERNEL_IMAGE_SIZE)
+ unsigned long y = x - __START_KERNEL_map;
+
+ /* use the carry flag to determine if x was < __START_KERNEL_map */
+ if (unlikely(x > y)) {
+ x = y + phys_base;
+
+ if (y >= KERNEL_IMAGE_SIZE)
return false;
- x += phys_base;
} else {
- if (x < PAGE_OFFSET)
- return false;
- x -= PAGE_OFFSET;
- if (!phys_addr_valid(x))
+ x = y + (__START_KERNEL_map - PAGE_OFFSET);
+
+ /* carry flag will be set if starting x was >= PAGE_OFFSET */
+ if ((x > y) || !phys_addr_valid(x))
return false;
}

2012-11-16 21:56:58

by Duyck, Alexander H

Subject: [PATCH v4 3/8] x86: Make it so that __pa_symbol can only process kernel symbols on x86_64

I submitted an earlier patch that makes __phys_addr an inline. This obviously
results in an increase in the code size. One step I can take to reduce that
is to make it so that the __pa_symbol call does a direct translation for
kernel addresses instead of covering all of virtual memory.

On my system this reduced the size for __pa_symbol from 5 instructions
totalling 30 bytes to 3 instructions totalling 16 bytes.

Signed-off-by: Alexander Duyck <[email protected]>
---
v4: Dropped y>x check in debug version of __phys_addr_symbol since we already
checked for y >= KERNEL_IMAGE_SIZE.
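
As a quick sanity check of the non-debug path (an illustration only, assuming
phys_base == 0 as on a non-relocated kernel):

    __pa_symbol(0xffffffff81000000)
        = 0xffffffff81000000 - __START_KERNEL_map + phys_base
        = 0xffffffff81000000 - 0xffffffff80000000 + 0
        = 0x1000000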

arch/x86/include/asm/page.h | 3 ++-
arch/x86/include/asm/page_32.h | 1 +
arch/x86/include/asm/page_64.h | 3 +++
arch/x86/mm/physaddr.c | 11 +++++++++++
4 files changed, 17 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/page.h b/arch/x86/include/asm/page.h
index 8ca8283..3698a6a 100644
--- a/arch/x86/include/asm/page.h
+++ b/arch/x86/include/asm/page.h
@@ -44,7 +44,8 @@ static inline void copy_user_page(void *to, void *from, unsigned long vaddr,
* case properly. Once all supported versions of gcc understand it, we can
* remove this Voodoo magic stuff. (i.e. once gcc3.x is deprecated)
*/
-#define __pa_symbol(x) __pa(__phys_reloc_hide((unsigned long)(x)))
+#define __pa_symbol(x) \
+ __phys_addr_symbol(__phys_reloc_hide((unsigned long)(x)))

#define __va(x) ((void *)((unsigned long)(x)+PAGE_OFFSET))

diff --git a/arch/x86/include/asm/page_32.h b/arch/x86/include/asm/page_32.h
index da4e762..4d550d0 100644
--- a/arch/x86/include/asm/page_32.h
+++ b/arch/x86/include/asm/page_32.h
@@ -15,6 +15,7 @@ extern unsigned long __phys_addr(unsigned long);
#else
#define __phys_addr(x) __phys_addr_nodebug(x)
#endif
+#define __phys_addr_symbol(x) __phys_addr(x)
#define __phys_reloc_hide(x) RELOC_HIDE((x), 0)

#ifdef CONFIG_FLATMEM
diff --git a/arch/x86/include/asm/page_64.h b/arch/x86/include/asm/page_64.h
index 5138174..0f1ddee 100644
--- a/arch/x86/include/asm/page_64.h
+++ b/arch/x86/include/asm/page_64.h
@@ -21,8 +21,11 @@ static inline unsigned long __phys_addr_nodebug(unsigned long x)

#ifdef CONFIG_DEBUG_VIRTUAL
extern unsigned long __phys_addr(unsigned long);
+extern unsigned long __phys_addr_symbol(unsigned long);
#else
#define __phys_addr(x) __phys_addr_nodebug(x)
+#define __phys_addr_symbol(x) \
+ ((unsigned long)(x) - __START_KERNEL_map + phys_base)
#endif

#define __phys_reloc_hide(x) (x)
diff --git a/arch/x86/mm/physaddr.c b/arch/x86/mm/physaddr.c
index fd40d75..c73fedd 100644
--- a/arch/x86/mm/physaddr.c
+++ b/arch/x86/mm/physaddr.c
@@ -28,6 +28,17 @@ unsigned long __phys_addr(unsigned long x)
return x;
}
EXPORT_SYMBOL(__phys_addr);
+
+unsigned long __phys_addr_symbol(unsigned long x)
+{
+ unsigned long y = x - __START_KERNEL_map;
+
+ /* only check upper bounds since lower bounds will trigger carry */
+ VIRTUAL_BUG_ON(y >= KERNEL_IMAGE_SIZE);
+
+ return y + phys_base;
+}
+EXPORT_SYMBOL(__phys_addr_symbol);
#endif

bool __virt_addr_valid(unsigned long x)

2012-11-16 21:57:46

by Duyck, Alexander H

Subject: [PATCH v4 4/8] x86: Drop 4 unnecessary calls to __pa_symbol

While debugging the __pa_symbol inline patch I found that there were a couple
spots where __pa_symbol was used as follows:
__pa_symbol(x) - __pa_symbol(y)

The compiler had reduced them to:
x - y

Since we also support a debug case where __pa_symbol is a function call, it
would probably be useful to just change the two cases I found so that they are
always treated as "x - y". As such, I am casting the values to phys_addr_t
and then doing simple subtraction so that the correct type and value is
returned.

Signed-off-by: Alexander Duyck <[email protected]>
---
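
The reduction the compiler was already performing falls directly out of the
definition (a sketch of the arithmetic using the non-debug __pa_symbol from
patch 3 of this series):

    __pa_symbol(x) - __pa_symbol(y)
        = (x - __START_KERNEL_map + phys_base)
          - (y - __START_KERNEL_map + phys_base)
        = x - y

so casting the two symbol addresses to phys_addr_t and subtracting produces
the same value without the function-call overhead of the debug case.
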
arch/x86/kernel/head32.c | 4 ++--
arch/x86/kernel/head64.c | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/head32.c b/arch/x86/kernel/head32.c
index c18f59d..f15db0c 100644
--- a/arch/x86/kernel/head32.c
+++ b/arch/x86/kernel/head32.c
@@ -30,8 +30,8 @@ static void __init i386_default_early_setup(void)

void __init i386_start_kernel(void)
{
- memblock_reserve(__pa_symbol(&_text),
- __pa_symbol(&__bss_stop) - __pa_symbol(&_text));
+ memblock_reserve(__pa_symbol(_text),
+ (phys_addr_t)__bss_stop - (phys_addr_t)_text);

#ifdef CONFIG_BLK_DEV_INITRD
/* Reserve INITRD */
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 037df57..42f5df1 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -97,8 +97,8 @@ void __init x86_64_start_reservations(char *real_mode_data)
{
copy_bootdata(__va(real_mode_data));

- memblock_reserve(__pa_symbol(&_text),
- __pa_symbol(&__bss_stop) - __pa_symbol(&_text));
+ memblock_reserve(__pa_symbol(_text),
+ (phys_addr_t)__bss_stop - (phys_addr_t)_text);

#ifdef CONFIG_BLK_DEV_INITRD
/* Reserve INITRD */

2012-11-16 21:58:25

by Duyck, Alexander H

Subject: [PATCH v4 5/8] x86: Use __pa_symbol instead of __pa on C visible symbols

When I made an attempt at separating __pa_symbol and __pa I found that there
were a number of cases where __pa was used on an obvious symbol.

I also caught one non-obvious case, as _brk_start and _brk_end are based on
the address of __brk_base, which is a C visible symbol.

In mark_rodata_ro I was able to reduce the overhead of translating kernel
symbols to direct-mapped virtual addresses by using __va(__pa_symbol())
instead of page_address(virt_to_page()).

Signed-off-by: Alexander Duyck <[email protected]>
---
v3: Added changes to init_64.c function mark_rodata_ro to avoid unnecessary
conversion to and from a page when all that is wanted is a virtual
address.
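
For reference, the two forms arrive at the same direct-mapping address (a
sketch of the equivalence, assuming the non-debug definitions from earlier in
the series): page_address(virt_to_page(x)) round-trips through the struct
page for the pfn backing x, while

    __va(__pa_symbol(x)) = x - __START_KERNEL_map + phys_base + PAGE_OFFSET

is plain arithmetic with no struct page lookup.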

arch/x86/kernel/cpu/intel.c | 2 +-
arch/x86/kernel/setup.c | 16 ++++++++--------
arch/x86/mm/init_64.c | 18 ++++++++----------
arch/x86/mm/pageattr.c | 8 ++++----
arch/x86/platform/efi/efi.c | 4 ++--
arch/x86/realmode/init.c | 8 ++++----
6 files changed, 27 insertions(+), 29 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 198e019..2249e7e 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -168,7 +168,7 @@ int __cpuinit ppro_with_ram_bug(void)
#ifdef CONFIG_X86_F00F_BUG
static void __cpuinit trap_init_f00f_bug(void)
{
- __set_fixmap(FIX_F00F_IDT, __pa(&idt_table), PAGE_KERNEL_RO);
+ __set_fixmap(FIX_F00F_IDT, __pa_symbol(idt_table), PAGE_KERNEL_RO);

/*
* Update the IDT descriptor and reload the IDT so that
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index ca45696..2702c5d 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -300,8 +300,8 @@ static void __init cleanup_highmap(void)
static void __init reserve_brk(void)
{
if (_brk_end > _brk_start)
- memblock_reserve(__pa(_brk_start),
- __pa(_brk_end) - __pa(_brk_start));
+ memblock_reserve(__pa_symbol(_brk_start),
+ _brk_end - _brk_start);

/* Mark brk area as locked down and no longer taking any
new allocations */
@@ -761,12 +761,12 @@ void __init setup_arch(char **cmdline_p)
init_mm.end_data = (unsigned long) _edata;
init_mm.brk = _brk_end;

- code_resource.start = virt_to_phys(_text);
- code_resource.end = virt_to_phys(_etext)-1;
- data_resource.start = virt_to_phys(_etext);
- data_resource.end = virt_to_phys(_edata)-1;
- bss_resource.start = virt_to_phys(&__bss_start);
- bss_resource.end = virt_to_phys(&__bss_stop)-1;
+ code_resource.start = __pa_symbol(_text);
+ code_resource.end = __pa_symbol(_etext)-1;
+ data_resource.start = __pa_symbol(_etext);
+ data_resource.end = __pa_symbol(_edata)-1;
+ bss_resource.start = __pa_symbol(__bss_start);
+ bss_resource.end = __pa_symbol(__bss_stop)-1;

#ifdef CONFIG_CMDLINE_BOOL
#ifdef CONFIG_CMDLINE_OVERRIDE
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 3baff25..0374a10 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -770,12 +770,10 @@ void set_kernel_text_ro(void)
void mark_rodata_ro(void)
{
unsigned long start = PFN_ALIGN(_text);
- unsigned long rodata_start =
- ((unsigned long)__start_rodata + PAGE_SIZE - 1) & PAGE_MASK;
+ unsigned long rodata_start = PFN_ALIGN(__start_rodata);
unsigned long end = (unsigned long) &__end_rodata_hpage_align;
- unsigned long text_end = PAGE_ALIGN((unsigned long) &__stop___ex_table);
- unsigned long rodata_end = PAGE_ALIGN((unsigned long) &__end_rodata);
- unsigned long data_start = (unsigned long) &_sdata;
+ unsigned long text_end = PFN_ALIGN(&__stop___ex_table);
+ unsigned long rodata_end = PFN_ALIGN(&__end_rodata);

printk(KERN_INFO "Write protecting the kernel read-only data: %luk\n",
(end - start) >> 10);
@@ -800,12 +798,12 @@ void mark_rodata_ro(void)
#endif

free_init_pages("unused kernel memory",
- (unsigned long) page_address(virt_to_page(text_end)),
- (unsigned long)
- page_address(virt_to_page(rodata_start)));
+ (unsigned long) __va(__pa_symbol(text_end)),
+ (unsigned long) __va(__pa_symbol(rodata_start)));
+
free_init_pages("unused kernel memory",
- (unsigned long) page_address(virt_to_page(rodata_end)),
- (unsigned long) page_address(virt_to_page(data_start)));
+ (unsigned long) __va(__pa_symbol(rodata_end)),
+ (unsigned long) __va(__pa_symbol(_sdata)));
}

#endif
diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index a718e0d..40f92f3 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -94,12 +94,12 @@ static inline void split_page_count(int level) { }

static inline unsigned long highmap_start_pfn(void)
{
- return __pa(_text) >> PAGE_SHIFT;
+ return __pa_symbol(_text) >> PAGE_SHIFT;
}

static inline unsigned long highmap_end_pfn(void)
{
- return __pa(roundup(_brk_end, PMD_SIZE)) >> PAGE_SHIFT;
+ return __pa_symbol(roundup(_brk_end, PMD_SIZE)) >> PAGE_SHIFT;
}

#endif
@@ -276,8 +276,8 @@ static inline pgprot_t static_protections(pgprot_t prot, unsigned long address,
* The .rodata section needs to be read-only. Using the pfn
* catches all aliases.
*/
- if (within(pfn, __pa((unsigned long)__start_rodata) >> PAGE_SHIFT,
- __pa((unsigned long)__end_rodata) >> PAGE_SHIFT))
+ if (within(pfn, __pa_symbol(__start_rodata) >> PAGE_SHIFT,
+ __pa_symbol(__end_rodata) >> PAGE_SHIFT))
pgprot_val(forbidden) |= _PAGE_RW;

#if defined(CONFIG_X86_64) && defined(CONFIG_DEBUG_RODATA)
diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index ad44391..1b60026 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -410,8 +410,8 @@ void __init efi_reserve_boot_services(void)
* - Not within any part of the kernel
* - Not the bios reserved area
*/
- if ((start+size >= virt_to_phys(_text)
- && start <= virt_to_phys(_end)) ||
+ if ((start+size >= __pa_symbol(_text)
+ && start <= __pa_symbol(_end)) ||
!e820_all_mapped(start, start+size, E820_RAM) ||
memblock_is_region_reserved(start, size)) {
/* Could not reserve, skip it */
diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
index cbca565..8045026 100644
--- a/arch/x86/realmode/init.c
+++ b/arch/x86/realmode/init.c
@@ -62,9 +62,9 @@ void __init setup_real_mode(void)
__va(real_mode_header->trampoline_header);

#ifdef CONFIG_X86_32
- trampoline_header->start = __pa(startup_32_smp);
+ trampoline_header->start = __pa_symbol(startup_32_smp);
trampoline_header->gdt_limit = __BOOT_DS + 7;
- trampoline_header->gdt_base = __pa(boot_gdt);
+ trampoline_header->gdt_base = __pa_symbol(boot_gdt);
#else
/*
* Some AMD processors will #GP(0) if EFER.LMA is set in WRMSR
@@ -78,8 +78,8 @@ void __init setup_real_mode(void)
*trampoline_cr4_features = read_cr4();

trampoline_pgd = (u64 *) __va(real_mode_header->trampoline_pgd);
- trampoline_pgd[0] = __pa(level3_ident_pgt) + _KERNPG_TABLE;
- trampoline_pgd[511] = __pa(level3_kernel_pgt) + _KERNPG_TABLE;
+ trampoline_pgd[0] = __pa_symbol(level3_ident_pgt) + _KERNPG_TABLE;
+ trampoline_pgd[511] = __pa_symbol(level3_kernel_pgt) + _KERNPG_TABLE;
#endif
}

2012-11-16 21:58:43

by Duyck, Alexander H

Subject: [PATCH v4 6/8] x86/ftrace: Use __pa_symbol instead of __pa on C visible symbols

Instead of using __pa, which is meant to be a general function for converting
virtual addresses to physical addresses, we can use __pa_symbol, which is the
preferred way of decoding kernel text virtual addresses to physical addresses.

In this case we are not directly converting C visible symbols; however, if we
know that the instruction pointer is somewhere between _text and _etext, we
know that we are going to be translating an address from the kernel text
space.

Cc: Steven Rostedt <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Signed-off-by: Alexander Duyck <[email protected]>
---
arch/x86/kernel/ftrace.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 1d41402..42a392a 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -89,7 +89,7 @@ do_ftrace_mod_code(unsigned long ip, const void *new_code)
* kernel identity mapping to modify code.
*/
if (within(ip, (unsigned long)_text, (unsigned long)_etext))
- ip = (unsigned long)__va(__pa(ip));
+ ip = (unsigned long)__va(__pa_symbol(ip));

return probe_kernel_write((void *)ip, new_code, MCOUNT_INSN_SIZE);
}
@@ -279,7 +279,7 @@ static int ftrace_write(unsigned long ip, const char *val, int size)
* kernel identity mapping to modify code.
*/
if (within(ip, (unsigned long)_text, (unsigned long)_etext))
- ip = (unsigned long)__va(__pa(ip));
+ ip = (unsigned long)__va(__pa_symbol(ip));

return probe_kernel_write((void *)ip, val, size);
}

2012-11-16 21:58:55

by Duyck, Alexander H

Subject: [PATCH v4 7/8] x86/acpi: Use __pa_symbol instead of __pa on C visible symbols

This change just updates one spot where __pa was being used when __pa_symbol
should have been used. By using __pa_symbol we are able to drop a few extra
lines of code, as we don't have to test whether the virtual pointer is part
of the kernel text or just standard virtual memory.

Cc: Len Brown <[email protected]>
Cc: Pavel Machek <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Signed-off-by: Alexander Duyck <[email protected]>
---
arch/x86/kernel/acpi/sleep.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/acpi/sleep.c b/arch/x86/kernel/acpi/sleep.c
index 11676cf..f146a3c 100644
--- a/arch/x86/kernel/acpi/sleep.c
+++ b/arch/x86/kernel/acpi/sleep.c
@@ -69,7 +69,7 @@ int acpi_suspend_lowlevel(void)

#ifndef CONFIG_64BIT
header->pmode_entry = (u32)&wakeup_pmode_return;
- header->pmode_cr3 = (u32)__pa(&initial_page_table);
+ header->pmode_cr3 = (u32)__pa_symbol(initial_page_table);
saved_magic = 0x12345678;
#else /* CONFIG_64BIT */
#ifdef CONFIG_SMP

2012-11-16 21:59:23

by Duyck, Alexander H

Subject: [PATCH v4 8/8] x86/lguest: Use __pa_symbol instead of __pa on C visible symbols

The function lguest_write_cr3 is using __pa to convert swapper_pg_dir and
initial_page_table from virtual addresses to physical addresses. The correct
function to use for these values is __pa_symbol, since they are C visible
symbols.

Cc: Rusty Russell <[email protected]>
Signed-off-by: Alexander Duyck <[email protected]>
---
arch/x86/lguest/boot.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/arch/x86/lguest/boot.c b/arch/x86/lguest/boot.c
index 642d880..139dd35 100644
--- a/arch/x86/lguest/boot.c
+++ b/arch/x86/lguest/boot.c
@@ -552,7 +552,8 @@ static void lguest_write_cr3(unsigned long cr3)
current_cr3 = cr3;

/* These two page tables are simple, linear, and used during boot */
- if (cr3 != __pa(swapper_pg_dir) && cr3 != __pa(initial_page_table))
+ if (cr3 != __pa_symbol(swapper_pg_dir) &&
+ cr3 != __pa_symbol(initial_page_table))
cr3_changed = true;
}

2012-11-16 22:02:39

by Pavel Machek

Subject: Re: [PATCH v4 7/8] x86/acpi: Use __pa_symbol instead of __pa on C visible symbols

On Fri 2012-11-16 13:57:43, Alexander Duyck wrote:
> This change just updates one spot where __pa was being used when __pa_symbol
> should have been used. By using __pa_symbol we are able to drop a few extra
> lines of code, as we don't have to test whether the virtual pointer is part
> of the kernel text or just standard virtual memory.
>
> Cc: Len Brown <[email protected]>
> Cc: Pavel Machek <[email protected]>

ACK.

2012-11-16 22:21:00

by Steven Rostedt

Subject: Re: [PATCH v4 6/8] x86/ftrace: Use __pa_symbol instead of __pa on C visible symbols

On Fri, 2012-11-16 at 13:57 -0800, Alexander Duyck wrote:
> Instead of using __pa, which is meant to be a general function for converting
> virtual addresses to physical addresses, we can use __pa_symbol, which is the
> preferred way of decoding kernel text virtual addresses to physical addresses.
>
> In this case we are not directly converting C visible symbols; however, if we
> know that the instruction pointer is somewhere between _text and _etext, we
> know that we are going to be translating an address from the kernel text
> space.

Can I ask what the purpose of this is? I'm a little skeptical of a
change that is just "the preferred way".

Was there some bug that this fixes?

-- Steve

>
> Cc: Steven Rostedt <[email protected]>
> Cc: Frederic Weisbecker <[email protected]>
> Signed-off-by: Alexander Duyck <[email protected]>
> ---
> arch/x86/kernel/ftrace.c | 4 ++--
> 1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
> index 1d41402..42a392a 100644
> --- a/arch/x86/kernel/ftrace.c
> +++ b/arch/x86/kernel/ftrace.c
> @@ -89,7 +89,7 @@ do_ftrace_mod_code(unsigned long ip, const void *new_code)
> * kernel identity mapping to modify code.
> */
> if (within(ip, (unsigned long)_text, (unsigned long)_etext))
> - ip = (unsigned long)__va(__pa(ip));
> + ip = (unsigned long)__va(__pa_symbol(ip));
>
> return probe_kernel_write((void *)ip, new_code, MCOUNT_INSN_SIZE);
> }
> @@ -279,7 +279,7 @@ static int ftrace_write(unsigned long ip, const char *val, int size)
> * kernel identity mapping to modify code.
> */
> if (within(ip, (unsigned long)_text, (unsigned long)_etext))
> - ip = (unsigned long)__va(__pa(ip));
> + ip = (unsigned long)__va(__pa_symbol(ip));
>
> return probe_kernel_write((void *)ip, val, size);
> }

2012-11-16 22:37:20

by H. Peter Anvin

Subject: Re: [PATCH v4 6/8] x86/ftrace: Use __pa_symbol instead of __pa on C visible symbols

It is a performance improvement.

Steven Rostedt <[email protected]> wrote:

>On Fri, 2012-11-16 at 13:57 -0800, Alexander Duyck wrote:
>> Instead of using __pa, which is meant to be a general function for
>> converting virtual addresses to physical addresses, we can use
>> __pa_symbol, which is the preferred way of decoding kernel text virtual
>> addresses to physical addresses.
>>
>> In this case we are not directly converting C visible symbols; however,
>> if we know that the instruction pointer is somewhere between _text and
>> _etext, we know that we are going to be translating an address from the
>> kernel text space.
>
>Can I ask what the purpose of this is? I'm a little skeptical of a
>change that is just "the preferred way".
>
>Was there some bug that this fixes?
>
>-- Steve
>
>>
>> Cc: Steven Rostedt <[email protected]>
>> Cc: Frederic Weisbecker <[email protected]>
>> Signed-off-by: Alexander Duyck <[email protected]>
>> ---
>> arch/x86/kernel/ftrace.c | 4 ++--
>> 1 files changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
>> index 1d41402..42a392a 100644
>> --- a/arch/x86/kernel/ftrace.c
>> +++ b/arch/x86/kernel/ftrace.c
>> @@ -89,7 +89,7 @@ do_ftrace_mod_code(unsigned long ip, const void *new_code)
>> * kernel identity mapping to modify code.
>> */
>> if (within(ip, (unsigned long)_text, (unsigned long)_etext))
>> - ip = (unsigned long)__va(__pa(ip));
>> + ip = (unsigned long)__va(__pa_symbol(ip));
>>
>> return probe_kernel_write((void *)ip, new_code, MCOUNT_INSN_SIZE);
>> }
>> @@ -279,7 +279,7 @@ static int ftrace_write(unsigned long ip, const char *val, int size)
>> * kernel identity mapping to modify code.
>> */
>> if (within(ip, (unsigned long)_text, (unsigned long)_etext))
>> - ip = (unsigned long)__va(__pa(ip));
>> + ip = (unsigned long)__va(__pa_symbol(ip));
>>
>> return probe_kernel_write((void *)ip, val, size);
>> }

--
Sent from my mobile phone. Please excuse brevity and lack of formatting.

2012-11-16 22:45:28

by Steven Rostedt

Subject: Re: [PATCH v4 6/8] x86/ftrace: Use __pa_symbol instead of __pa on C visible symbols

On Fri, 2012-11-16 at 14:25 -0800, H. Peter Anvin wrote:
> It is a performance improvement.
>
> >> * kernel identity mapping to modify code.
> >> */
> >> if (within(ip, (unsigned long)_text, (unsigned long)_etext))
> >> - ip = (unsigned long)__va(__pa(ip));
> >> + ip = (unsigned long)__va(__pa_symbol(ip));
> >>
> >> return probe_kernel_write((void *)ip, new_code, MCOUNT_INSN_SIZE);
> >> }
> >> @@ -279,7 +279,7 @@ static int ftrace_write(unsigned long ip, const char *val, int size)
> >> * kernel identity mapping to modify code.
> >> */
> >> if (within(ip, (unsigned long)_text, (unsigned long)_etext))
> >> - ip = (unsigned long)__va(__pa(ip));
> >> + ip = (unsigned long)__va(__pa_symbol(ip));
> >>
> >> return probe_kernel_write((void *)ip, val, size);
> >> }
>

#define __pa(x) __phys_addr((unsigned long)(x))
#define __pa_symbol(x) __pa(__phys_reloc_hide((unsigned long)(x)))


I'm confused. __pa_symbol() just calls __pa() with some macro magic to
its parameter. How is this a performance improvement?

-- Steve

2012-11-16 23:06:21

by H. Peter Anvin

Subject: Re: [PATCH v4 6/8] x86/ftrace: Use __pa_symbol instead of __pa on C visible symbols

On 11/16/2012 02:45 PM, Steven Rostedt wrote:
>
> #define __pa(x) __phys_addr((unsigned long)(x))
> #define __pa_symbol(x) __pa(__phys_reloc_hide((unsigned long)(x)))
>
> I'm confused. __pa_symbol() just calls __pa() with some macro magic to
> its parameter. How is this a performance improvement?
>

One of the earlier patches in this series changes __pa_symbol() to avoid
the conditional hidden inside __phys_addr(), since by definition a
symbol can only be on one side of that branch.

-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.

2012-11-16 23:20:08

by Duyck, Alexander H

Subject: Re: [PATCH v4 6/8] x86/ftrace: Use __pa_symbol instead of __pa on C visible symbols

On 11/16/2012 03:06 PM, H. Peter Anvin wrote:
> On 11/16/2012 02:45 PM, Steven Rostedt wrote:
>>
>> #define __pa(x) __phys_addr((unsigned long)(x))
>> #define __pa_symbol(x) __pa(__phys_reloc_hide((unsigned long)(x)))
>>
>> I'm confused. __pa_symbol() just calls __pa() with some macro magic to
>> its parameter. How is this a performance improvement?
>>
>
> One of the earlier patches in this series changes __pa_symbol() to
> avoid the conditional hidden inside __phys_addr(), since by definition
> a symbol can only be on one side of that branch.
>
> -hpa
>

In addition to being a bit faster, the code is also a bit smaller, since it
can combine the constants from __va() and __pa_symbol(), as the new
__pa_symbol is an inline in the non-debug case.
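
Spelled out with the non-debug definitions (a sketch of the folding, assuming
the constants from earlier in the series):

    __va(__pa_symbol(ip))
        = ip - __START_KERNEL_map + phys_base + PAGE_OFFSET
        = ip + phys_base + (PAGE_OFFSET - __START_KERNEL_map)

where (PAGE_OFFSET - __START_KERNEL_map) folds to a single compile-time
constant, leaving just an add or two and no branch.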

Thanks,

Alex

2012-11-16 23:30:33

by Steven Rostedt

Subject: Re: [PATCH v4 6/8] x86/ftrace: Use __pa_symbol instead of __pa on C visible symbols

On Fri, 2012-11-16 at 15:06 -0800, H. Peter Anvin wrote:
> On 11/16/2012 02:45 PM, Steven Rostedt wrote:
> >
> > #define __pa(x) __phys_addr((unsigned long)(x))
> > #define __pa_symbol(x) __pa(__phys_reloc_hide((unsigned long)(x)))
> >
> > I'm confused. __pa_symbol() just calls __pa() with some macro magic to
> > its parameter. How is this a performance improvement?
> >
>
> One of the earlier patches in this series changes __pa_symbol() to avoid
> the conditional hidden inside __phys_addr(), since by definition a
> symbol can only be on one side of that branch.
>

Ah I missed the 6/8 reference and haven't looked at the rest of the
series.

OK, this makes sense now.

-- Steve

2012-11-17 00:23:36

by Duyck, Alexander H

Subject: [tip:x86/mm] x86: Move some contents of page_64_types.h into pgtable_64.h and page_64.h

Commit-ID: 66d61384e9b4087f044ce86cb4adb12fe4623a6b
Gitweb: http://git.kernel.org/tip/66d61384e9b4087f044ce86cb4adb12fe4623a6b
Author: Alexander Duyck <[email protected]>
AuthorDate: Fri, 16 Nov 2012 13:53:09 -0800
Committer: H. Peter Anvin <[email protected]>
CommitDate: Fri, 16 Nov 2012 15:20:10 -0800

x86: Move some contents of page_64_types.h into pgtable_64.h and page_64.h

This patch is meant to clean up the fact that we have several functions in
page_64_types.h which really don't belong there. I found this issue when I
tried to replace __phys_addr with an inline function; that resulted in the
realmode bits generating compile warnings about types. To resolve that, I am
relocating the address translation to page_64.h, since this is in keeping
with where these functions are located on 32-bit.

In addition, I have relocated the declarations for several functions defined
in init_64.c to pgtable_64.h, as this seems to be where most of the functions
related to memory initialization were already declared.

Signed-off-by: Alexander Duyck <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: H. Peter Anvin <[email protected]>
---
arch/x86/include/asm/page_64.h | 19 +++++++++++++++++++
arch/x86/include/asm/page_64_types.h | 22 ----------------------
arch/x86/include/asm/pgtable_64.h | 5 +++++
3 files changed, 24 insertions(+), 22 deletions(-)

diff --git a/arch/x86/include/asm/page_64.h b/arch/x86/include/asm/page_64.h
index 072694e..4150999 100644
--- a/arch/x86/include/asm/page_64.h
+++ b/arch/x86/include/asm/page_64.h
@@ -3,4 +3,23 @@

#include <asm/page_64_types.h>

+#ifndef __ASSEMBLY__
+
+/* duplicated to the one in bootmem.h */
+extern unsigned long max_pfn;
+extern unsigned long phys_base;
+
+extern unsigned long __phys_addr(unsigned long);
+
+#define __phys_reloc_hide(x) (x)
+
+#ifdef CONFIG_FLATMEM
+#define pfn_valid(pfn) ((pfn) < max_pfn)
+#endif
+
+void clear_page(void *page);
+void copy_page(void *to, void *from);
+
+#endif /* !__ASSEMBLY__ */
+
#endif /* _ASM_X86_PAGE_64_H */
diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
index 320f7bb..8b491e6 100644
--- a/arch/x86/include/asm/page_64_types.h
+++ b/arch/x86/include/asm/page_64_types.h
@@ -50,26 +50,4 @@
#define KERNEL_IMAGE_SIZE (512 * 1024 * 1024)
#define KERNEL_IMAGE_START _AC(0xffffffff80000000, UL)

-#ifndef __ASSEMBLY__
-void clear_page(void *page);
-void copy_page(void *to, void *from);
-
-/* duplicated to the one in bootmem.h */
-extern unsigned long max_pfn;
-extern unsigned long phys_base;
-
-extern unsigned long __phys_addr(unsigned long);
-#define __phys_reloc_hide(x) (x)
-
-#define vmemmap ((struct page *)VMEMMAP_START)
-
-extern void init_extra_mapping_uc(unsigned long phys, unsigned long size);
-extern void init_extra_mapping_wb(unsigned long phys, unsigned long size);
-
-#endif /* !__ASSEMBLY__ */
-
-#ifdef CONFIG_FLATMEM
-#define pfn_valid(pfn) ((pfn) < max_pfn)
-#endif
-
#endif /* _ASM_X86_PAGE_64_DEFS_H */
diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h
index 47356f9..b5d30ad 100644
--- a/arch/x86/include/asm/pgtable_64.h
+++ b/arch/x86/include/asm/pgtable_64.h
@@ -183,6 +183,11 @@ extern void cleanup_highmap(void);

#define __HAVE_ARCH_PTE_SAME

+#define vmemmap ((struct page *)VMEMMAP_START)
+
+extern void init_extra_mapping_uc(unsigned long phys, unsigned long size);
+extern void init_extra_mapping_wb(unsigned long phys, unsigned long size);
+
#endif /* !__ASSEMBLY__ */

#endif /* _ASM_X86_PGTABLE_64_H */

2012-11-17 00:24:20

by Duyck, Alexander H

Subject: [tip:x86/mm] x86: Improve __phys_addr performance by making use of carry flags and inlining

Commit-ID: 9dfc3fc0305e158f44c9bb05e9fa13be6125cc4b
Gitweb: http://git.kernel.org/tip/9dfc3fc0305e158f44c9bb05e9fa13be6125cc4b
Author: Alexander Duyck <[email protected]>
AuthorDate: Fri, 16 Nov 2012 13:53:51 -0800
Committer: H. Peter Anvin <[email protected]>
CommitDate: Fri, 16 Nov 2012 15:20:20 -0800

x86: Improve __phys_addr performance by making use of carry flags and inlining

This patch is meant to improve overall system performance when making use of
the __phys_addr call. To do this I have implemented several changes.

First, if CONFIG_DEBUG_VIRTUAL is not defined, __phys_addr is made an inline,
similar to how this is currently handled on 32-bit. However, in order to do
this it is necessary to export phys_base so that it is available when
__phys_addr is used in kernel modules.

The second change was to streamline the code by making use of the carry flag
on an add operation instead of performing a compare on a 64 bit value. The
advantage to this is that it allows us to significantly reduce the overall
size of the call. On my Xeon E5 system the entire __phys_addr inline call
consumes a little less than 32 bytes and 5 instructions. I also applied
similar logic to the debug version of the function. My testing shows that the
debug version of the function with this patch applied is slightly faster than
the non-debug version without the patch.

Finally, I also applied the same logic changes to __virt_addr_valid, since it
used the same general code flow as __phys_addr and could achieve similar gains
through these changes.

Signed-off-by: Alexander Duyck <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: H. Peter Anvin <[email protected]>
---
arch/x86/include/asm/page_64.h | 14 ++++++++++++++
arch/x86/kernel/x8664_ksyms_64.c | 3 +++
arch/x86/mm/physaddr.c | 40 +++++++++++++++++++++++++---------------
3 files changed, 42 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/page_64.h b/arch/x86/include/asm/page_64.h
index 4150999..5138174 100644
--- a/arch/x86/include/asm/page_64.h
+++ b/arch/x86/include/asm/page_64.h
@@ -9,7 +9,21 @@
extern unsigned long max_pfn;
extern unsigned long phys_base;

+static inline unsigned long __phys_addr_nodebug(unsigned long x)
+{
+ unsigned long y = x - __START_KERNEL_map;
+
+ /* use the carry flag to determine if x was < __START_KERNEL_map */
+ x = y + ((x > y) ? phys_base : (__START_KERNEL_map - PAGE_OFFSET));
+
+ return x;
+}
+
+#ifdef CONFIG_DEBUG_VIRTUAL
extern unsigned long __phys_addr(unsigned long);
+#else
+#define __phys_addr(x) __phys_addr_nodebug(x)
+#endif

#define __phys_reloc_hide(x) (x)

diff --git a/arch/x86/kernel/x8664_ksyms_64.c b/arch/x86/kernel/x8664_ksyms_64.c
index 1330dd1..b014d94 100644
--- a/arch/x86/kernel/x8664_ksyms_64.c
+++ b/arch/x86/kernel/x8664_ksyms_64.c
@@ -59,6 +59,9 @@ EXPORT_SYMBOL(memcpy);
EXPORT_SYMBOL(__memcpy);
EXPORT_SYMBOL(memmove);

+#ifndef CONFIG_DEBUG_VIRTUAL
+EXPORT_SYMBOL(phys_base);
+#endif
EXPORT_SYMBOL(empty_zero_page);
#ifndef CONFIG_PARAVIRT
EXPORT_SYMBOL(native_load_gs_index);
diff --git a/arch/x86/mm/physaddr.c b/arch/x86/mm/physaddr.c
index d2e2735..fd40d75 100644
--- a/arch/x86/mm/physaddr.c
+++ b/arch/x86/mm/physaddr.c
@@ -8,33 +8,43 @@

#ifdef CONFIG_X86_64

+#ifdef CONFIG_DEBUG_VIRTUAL
unsigned long __phys_addr(unsigned long x)
{
- if (x >= __START_KERNEL_map) {
- x -= __START_KERNEL_map;
- VIRTUAL_BUG_ON(x >= KERNEL_IMAGE_SIZE);
- x += phys_base;
+ unsigned long y = x - __START_KERNEL_map;
+
+ /* use the carry flag to determine if x was < __START_KERNEL_map */
+ if (unlikely(x > y)) {
+ x = y + phys_base;
+
+ VIRTUAL_BUG_ON(y >= KERNEL_IMAGE_SIZE);
} else {
- VIRTUAL_BUG_ON(x < PAGE_OFFSET);
- x -= PAGE_OFFSET;
- VIRTUAL_BUG_ON(!phys_addr_valid(x));
+ x = y + (__START_KERNEL_map - PAGE_OFFSET);
+
+ /* carry flag will be set if starting x was >= PAGE_OFFSET */
+ VIRTUAL_BUG_ON((x > y) || !phys_addr_valid(x));
}
+
return x;
}
EXPORT_SYMBOL(__phys_addr);
+#endif

bool __virt_addr_valid(unsigned long x)
{
- if (x >= __START_KERNEL_map) {
- x -= __START_KERNEL_map;
- if (x >= KERNEL_IMAGE_SIZE)
+ unsigned long y = x - __START_KERNEL_map;
+
+ /* use the carry flag to determine if x was < __START_KERNEL_map */
+ if (unlikely(x > y)) {
+ x = y + phys_base;
+
+ if (y >= KERNEL_IMAGE_SIZE)
return false;
- x += phys_base;
} else {
- if (x < PAGE_OFFSET)
- return false;
- x -= PAGE_OFFSET;
- if (!phys_addr_valid(x))
+ x = y + (__START_KERNEL_map - PAGE_OFFSET);
+
+ /* carry flag will be set if starting x was >= PAGE_OFFSET */
+ if ((x > y) || !phys_addr_valid(x))
return false;
}

2012-11-17 00:25:25

by Duyck, Alexander H

Subject: [tip:x86/mm] x86: Make it so that __pa_symbol can only process kernel symbols on x86_64

Commit-ID: bf4010fcf8f241a81693556b5abb48d9dbc33f97
Gitweb: http://git.kernel.org/tip/bf4010fcf8f241a81693556b5abb48d9dbc33f97
Author: Alexander Duyck <[email protected]>
AuthorDate: Fri, 16 Nov 2012 13:55:46 -0800
Committer: H. Peter Anvin <[email protected]>
CommitDate: Fri, 16 Nov 2012 15:20:25 -0800

x86: Make it so that __pa_symbol can only process kernel symbols on x86_64

I submitted an earlier patch that makes __phys_addr an inline. This obviously
results in an increase in the code size. One step I can take to reduce that
is to make it so that the __pa_symbol call does a direct translation for
kernel addresses instead of covering all of virtual memory.

On my system this reduced the size for __pa_symbol from 5 instructions
totalling 30 bytes to 3 instructions totalling 16 bytes.

Signed-off-by: Alexander Duyck <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: H. Peter Anvin <[email protected]>
---
arch/x86/include/asm/page.h | 3 ++-
arch/x86/include/asm/page_32.h | 1 +
arch/x86/include/asm/page_64.h | 3 +++
arch/x86/mm/physaddr.c | 11 +++++++++++
4 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/page.h b/arch/x86/include/asm/page.h
index 8ca8283..3698a6a 100644
--- a/arch/x86/include/asm/page.h
+++ b/arch/x86/include/asm/page.h
@@ -44,7 +44,8 @@ static inline void copy_user_page(void *to, void *from, unsigned long vaddr,
* case properly. Once all supported versions of gcc understand it, we can
* remove this Voodoo magic stuff. (i.e. once gcc3.x is deprecated)
*/
-#define __pa_symbol(x) __pa(__phys_reloc_hide((unsigned long)(x)))
+#define __pa_symbol(x) \
+ __phys_addr_symbol(__phys_reloc_hide((unsigned long)(x)))

#define __va(x) ((void *)((unsigned long)(x)+PAGE_OFFSET))

diff --git a/arch/x86/include/asm/page_32.h b/arch/x86/include/asm/page_32.h
index da4e762..4d550d0 100644
--- a/arch/x86/include/asm/page_32.h
+++ b/arch/x86/include/asm/page_32.h
@@ -15,6 +15,7 @@ extern unsigned long __phys_addr(unsigned long);
#else
#define __phys_addr(x) __phys_addr_nodebug(x)
#endif
+#define __phys_addr_symbol(x) __phys_addr(x)
#define __phys_reloc_hide(x) RELOC_HIDE((x), 0)

#ifdef CONFIG_FLATMEM
diff --git a/arch/x86/include/asm/page_64.h b/arch/x86/include/asm/page_64.h
index 5138174..0f1ddee 100644
--- a/arch/x86/include/asm/page_64.h
+++ b/arch/x86/include/asm/page_64.h
@@ -21,8 +21,11 @@ static inline unsigned long __phys_addr_nodebug(unsigned long x)

#ifdef CONFIG_DEBUG_VIRTUAL
extern unsigned long __phys_addr(unsigned long);
+extern unsigned long __phys_addr_symbol(unsigned long);
#else
#define __phys_addr(x) __phys_addr_nodebug(x)
+#define __phys_addr_symbol(x) \
+ ((unsigned long)(x) - __START_KERNEL_map + phys_base)
#endif

#define __phys_reloc_hide(x) (x)
diff --git a/arch/x86/mm/physaddr.c b/arch/x86/mm/physaddr.c
index fd40d75..c73fedd 100644
--- a/arch/x86/mm/physaddr.c
+++ b/arch/x86/mm/physaddr.c
@@ -28,6 +28,17 @@ unsigned long __phys_addr(unsigned long x)
return x;
}
EXPORT_SYMBOL(__phys_addr);
+
+unsigned long __phys_addr_symbol(unsigned long x)
+{
+ unsigned long y = x - __START_KERNEL_map;
+
+ /* only check upper bounds since lower bounds will trigger carry */
+ VIRTUAL_BUG_ON(y >= KERNEL_IMAGE_SIZE);
+
+ return y + phys_base;
+}
+EXPORT_SYMBOL(__phys_addr_symbol);
#endif

bool __virt_addr_valid(unsigned long x)

2012-11-17 00:26:47

by Duyck, Alexander H

Subject: [tip:x86/mm] x86: Drop 4 unnecessary calls to __pa_symbol

Commit-ID: 409fa8dcd2c8b1ec1e705ad99b152bce665af025
Gitweb: http://git.kernel.org/tip/409fa8dcd2c8b1ec1e705ad99b152bce665af025
Author: Alexander Duyck <[email protected]>
AuthorDate: Fri, 16 Nov 2012 13:56:35 -0800
Committer: H. Peter Anvin <[email protected]>
CommitDate: Fri, 16 Nov 2012 15:20:28 -0800

x86: Drop 4 unnecessary calls to __pa_symbol

While debugging the __pa_symbol inline patch I found that there were a couple
spots where __pa_symbol was used as follows:
__pa_symbol(x) - __pa_symbol(y)

The compiler had reduced them to:
x - y

Since we also support a debug case where __pa_symbol is a function call, it
would probably be useful to just change the two cases I found so that they are
always treated as "x - y". As such, I am casting the values to phys_addr_t
and then doing simple subtraction so that the correct type and value is
returned.

Signed-off-by: Alexander Duyck <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: H. Peter Anvin <[email protected]>
---
arch/x86/kernel/head32.c | 4 ++--
arch/x86/kernel/head64.c | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/head32.c b/arch/x86/kernel/head32.c
index c18f59d..f15db0c 100644
--- a/arch/x86/kernel/head32.c
+++ b/arch/x86/kernel/head32.c
@@ -30,8 +30,8 @@ static void __init i386_default_early_setup(void)

void __init i386_start_kernel(void)
{
- memblock_reserve(__pa_symbol(&_text),
- __pa_symbol(&__bss_stop) - __pa_symbol(&_text));
+ memblock_reserve(__pa_symbol(_text),
+ (phys_addr_t)__bss_stop - (phys_addr_t)_text);

#ifdef CONFIG_BLK_DEV_INITRD
/* Reserve INITRD */
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 037df57..42f5df1 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -97,8 +97,8 @@ void __init x86_64_start_reservations(char *real_mode_data)
{
copy_bootdata(__va(real_mode_data));

- memblock_reserve(__pa_symbol(&_text),
- __pa_symbol(&__bss_stop) - __pa_symbol(&_text));
+ memblock_reserve(__pa_symbol(_text),
+ (phys_addr_t)__bss_stop - (phys_addr_t)_text);

#ifdef CONFIG_BLK_DEV_INITRD
/* Reserve INITRD */

2012-11-17 00:27:38

by Duyck, Alexander H

Subject: [tip:x86/mm] x86: Use __pa_symbol instead of __pa on C visible symbols

Commit-ID: 132bc57b030fcdc0968e7cd217e3063f64ec5dce
Gitweb: http://git.kernel.org/tip/132bc57b030fcdc0968e7cd217e3063f64ec5dce
Author: Alexander Duyck <[email protected]>
AuthorDate: Fri, 16 Nov 2012 13:57:13 -0800
Committer: H. Peter Anvin <[email protected]>
CommitDate: Fri, 16 Nov 2012 15:20:42 -0800

x86: Use __pa_symbol instead of __pa on C visible symbols

When I made an attempt at separating __pa_symbol and __pa I found that there
were a number of cases where __pa was used on an obvious symbol.

I also caught one non-obvious case, as _brk_start and _brk_end are based on
the address of __brk_base, which is a C visible symbol.

In mark_rodata_ro I was able to reduce the overhead of translating kernel
symbols to direct-mapped virtual addresses by using __va(__pa_symbol())
instead of page_address(virt_to_page()).

Signed-off-by: Alexander Duyck <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: H. Peter Anvin <[email protected]>
---
arch/x86/kernel/cpu/intel.c | 2 +-
arch/x86/kernel/setup.c | 16 ++++++++--------
arch/x86/mm/init_64.c | 18 ++++++++----------
arch/x86/mm/pageattr.c | 8 ++++----
arch/x86/platform/efi/efi.c | 4 ++--
arch/x86/realmode/init.c | 8 ++++----
6 files changed, 27 insertions(+), 29 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 198e019..2249e7e 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -168,7 +168,7 @@ int __cpuinit ppro_with_ram_bug(void)
#ifdef CONFIG_X86_F00F_BUG
static void __cpuinit trap_init_f00f_bug(void)
{
- __set_fixmap(FIX_F00F_IDT, __pa(&idt_table), PAGE_KERNEL_RO);
+ __set_fixmap(FIX_F00F_IDT, __pa_symbol(idt_table), PAGE_KERNEL_RO);

/*
* Update the IDT descriptor and reload the IDT so that
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index ca45696..2702c5d 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -300,8 +300,8 @@ static void __init cleanup_highmap(void)
static void __init reserve_brk(void)
{
if (_brk_end > _brk_start)
- memblock_reserve(__pa(_brk_start),
- __pa(_brk_end) - __pa(_brk_start));
+ memblock_reserve(__pa_symbol(_brk_start),
+ _brk_end - _brk_start);

/* Mark brk area as locked down and no longer taking any
new allocations */
@@ -761,12 +761,12 @@ void __init setup_arch(char **cmdline_p)
init_mm.end_data = (unsigned long) _edata;
init_mm.brk = _brk_end;

- code_resource.start = virt_to_phys(_text);
- code_resource.end = virt_to_phys(_etext)-1;
- data_resource.start = virt_to_phys(_etext);
- data_resource.end = virt_to_phys(_edata)-1;
- bss_resource.start = virt_to_phys(&__bss_start);
- bss_resource.end = virt_to_phys(&__bss_stop)-1;
+ code_resource.start = __pa_symbol(_text);
+ code_resource.end = __pa_symbol(_etext)-1;
+ data_resource.start = __pa_symbol(_etext);
+ data_resource.end = __pa_symbol(_edata)-1;
+ bss_resource.start = __pa_symbol(__bss_start);
+ bss_resource.end = __pa_symbol(__bss_stop)-1;

#ifdef CONFIG_CMDLINE_BOOL
#ifdef CONFIG_CMDLINE_OVERRIDE
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 3baff25..0374a10 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -770,12 +770,10 @@ void set_kernel_text_ro(void)
void mark_rodata_ro(void)
{
unsigned long start = PFN_ALIGN(_text);
- unsigned long rodata_start =
- ((unsigned long)__start_rodata + PAGE_SIZE - 1) & PAGE_MASK;
+ unsigned long rodata_start = PFN_ALIGN(__start_rodata);
unsigned long end = (unsigned long) &__end_rodata_hpage_align;
- unsigned long text_end = PAGE_ALIGN((unsigned long) &__stop___ex_table);
- unsigned long rodata_end = PAGE_ALIGN((unsigned long) &__end_rodata);
- unsigned long data_start = (unsigned long) &_sdata;
+ unsigned long text_end = PFN_ALIGN(&__stop___ex_table);
+ unsigned long rodata_end = PFN_ALIGN(&__end_rodata);

printk(KERN_INFO "Write protecting the kernel read-only data: %luk\n",
(end - start) >> 10);
@@ -800,12 +798,12 @@ void mark_rodata_ro(void)
#endif

free_init_pages("unused kernel memory",
- (unsigned long) page_address(virt_to_page(text_end)),
- (unsigned long)
- page_address(virt_to_page(rodata_start)));
+ (unsigned long) __va(__pa_symbol(text_end)),
+ (unsigned long) __va(__pa_symbol(rodata_start)));
+
free_init_pages("unused kernel memory",
- (unsigned long) page_address(virt_to_page(rodata_end)),
- (unsigned long) page_address(virt_to_page(data_start)));
+ (unsigned long) __va(__pa_symbol(rodata_end)),
+ (unsigned long) __va(__pa_symbol(_sdata)));
}

#endif
diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index a718e0d..40f92f3 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -94,12 +94,12 @@ static inline void split_page_count(int level) { }

static inline unsigned long highmap_start_pfn(void)
{
- return __pa(_text) >> PAGE_SHIFT;
+ return __pa_symbol(_text) >> PAGE_SHIFT;
}

static inline unsigned long highmap_end_pfn(void)
{
- return __pa(roundup(_brk_end, PMD_SIZE)) >> PAGE_SHIFT;
+ return __pa_symbol(roundup(_brk_end, PMD_SIZE)) >> PAGE_SHIFT;
}

#endif
@@ -276,8 +276,8 @@ static inline pgprot_t static_protections(pgprot_t prot, unsigned long address,
* The .rodata section needs to be read-only. Using the pfn
* catches all aliases.
*/
- if (within(pfn, __pa((unsigned long)__start_rodata) >> PAGE_SHIFT,
- __pa((unsigned long)__end_rodata) >> PAGE_SHIFT))
+ if (within(pfn, __pa_symbol(__start_rodata) >> PAGE_SHIFT,
+ __pa_symbol(__end_rodata) >> PAGE_SHIFT))
pgprot_val(forbidden) |= _PAGE_RW;

#if defined(CONFIG_X86_64) && defined(CONFIG_DEBUG_RODATA)
diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index ad44391..1b60026 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -410,8 +410,8 @@ void __init efi_reserve_boot_services(void)
* - Not within any part of the kernel
* - Not the bios reserved area
*/
- if ((start+size >= virt_to_phys(_text)
- && start <= virt_to_phys(_end)) ||
+ if ((start+size >= __pa_symbol(_text)
+ && start <= __pa_symbol(_end)) ||
!e820_all_mapped(start, start+size, E820_RAM) ||
memblock_is_region_reserved(start, size)) {
/* Could not reserve, skip it */
diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
index cbca565..8045026 100644
--- a/arch/x86/realmode/init.c
+++ b/arch/x86/realmode/init.c
@@ -62,9 +62,9 @@ void __init setup_real_mode(void)
__va(real_mode_header->trampoline_header);

#ifdef CONFIG_X86_32
- trampoline_header->start = __pa(startup_32_smp);
+ trampoline_header->start = __pa_symbol(startup_32_smp);
trampoline_header->gdt_limit = __BOOT_DS + 7;
- trampoline_header->gdt_base = __pa(boot_gdt);
+ trampoline_header->gdt_base = __pa_symbol(boot_gdt);
#else
/*
* Some AMD processors will #GP(0) if EFER.LMA is set in WRMSR
@@ -78,8 +78,8 @@ void __init setup_real_mode(void)
*trampoline_cr4_features = read_cr4();

trampoline_pgd = (u64 *) __va(real_mode_header->trampoline_pgd);
- trampoline_pgd[0] = __pa(level3_ident_pgt) + _KERNPG_TABLE;
- trampoline_pgd[511] = __pa(level3_kernel_pgt) + _KERNPG_TABLE;
+ trampoline_pgd[0] = __pa_symbol(level3_ident_pgt) + _KERNPG_TABLE;
+ trampoline_pgd[511] = __pa_symbol(level3_kernel_pgt) + _KERNPG_TABLE;
#endif
}

2012-11-17 00:28:33

by Duyck, Alexander H

[permalink] [raw]
Subject: [tip:x86/mm] x86/ftrace: Use __pa_symbol instead of __pa on C visible symbols

Commit-ID: f38ad48f05973079aff40ac7a5a5ab463f7cee32
Gitweb: http://git.kernel.org/tip/f38ad48f05973079aff40ac7a5a5ab463f7cee32
Author: Alexander Duyck <[email protected]>
AuthorDate: Fri, 16 Nov 2012 13:57:32 -0800
Committer: H. Peter Anvin <[email protected]>
CommitDate: Fri, 16 Nov 2012 15:20:43 -0800

x86/ftrace: Use __pa_symbol instead of __pa on C visible symbols

Instead of using __pa, which is meant to be a general function for converting
virtual addresses to physical addresses, we can use __pa_symbol, which is the
preferred way of decoding kernel text virtual addresses to physical addresses.

In this case we are not directly converting C visible symbols; however, if we
know that the instruction pointer is somewhere between _text and _etext, we
know that we are going to be translating an address from the kernel text
space.
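
As a rough user-space sketch of what that translation buys us (the layout
constants and the sample ip below are assumed stand-ins, not values taken
from a running kernel):

#include <stdio.h>

/* assumed 2012-era x86_64 layout constants, for illustration only */
#define START_KERNEL_MAP  0xffffffff80000000UL /* __START_KERNEL_map */
#define PAGE_OFFSET_MODEL 0xffff880000000000UL /* PAGE_OFFSET */

static unsigned long phys_base = 0x1000000UL; /* assumed */

/*
 * Model of ip = __va(__pa_symbol(ip)): the high kernel text mapping may
 * be read-only, so code patching writes through the direct-mapping
 * alias of the same physical page.
 */
static unsigned long text_to_direct_alias(unsigned long ip)
{
	unsigned long phys = ip - START_KERNEL_MAP + phys_base;

	return phys + PAGE_OFFSET_MODEL;
}

int main(void)
{
	unsigned long ip = 0xffffffff810123a0UL; /* hypothetical text address */

	printf("%#lx -> %#lx\n", ip, text_to_direct_alias(ip));
	return 0;
}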

Cc: Steven Rostedt <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Signed-off-by: Alexander Duyck <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: H. Peter Anvin <[email protected]>
---
arch/x86/kernel/ftrace.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 1d41402..42a392a 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -89,7 +89,7 @@ do_ftrace_mod_code(unsigned long ip, const void *new_code)
* kernel identity mapping to modify code.
*/
if (within(ip, (unsigned long)_text, (unsigned long)_etext))
- ip = (unsigned long)__va(__pa(ip));
+ ip = (unsigned long)__va(__pa_symbol(ip));

return probe_kernel_write((void *)ip, new_code, MCOUNT_INSN_SIZE);
}
@@ -279,7 +279,7 @@ static int ftrace_write(unsigned long ip, const char *val, int size)
* kernel identity mapping to modify code.
*/
if (within(ip, (unsigned long)_text, (unsigned long)_etext))
- ip = (unsigned long)__va(__pa(ip));
+ ip = (unsigned long)__va(__pa_symbol(ip));

return probe_kernel_write((void *)ip, val, size);
}

2012-11-17 00:29:49

by Duyck, Alexander H

[permalink] [raw]
Subject: [tip:x86/mm] x86/acpi: Use __pa_symbol instead of __pa on C visible symbols

Commit-ID: 40276eb3568125e86014d703bae48dbfcb3c20f3
Gitweb: http://git.kernel.org/tip/40276eb3568125e86014d703bae48dbfcb3c20f3
Author: Alexander Duyck <[email protected]>
AuthorDate: Fri, 16 Nov 2012 13:57:43 -0800
Committer: H. Peter Anvin <[email protected]>
CommitDate: Fri, 16 Nov 2012 15:20:50 -0800

x86/acpi: Use __pa_symbol instead of __pa on C visible symbols

This change just updates one spot where __pa was being used when __pa_symbol
should have been used. By using __pa_symbol we are able to drop a few extra
lines of code, as we don't have to test whether the virtual pointer is part
of the kernel text or just standard virtual memory.

Cc: Len Brown <[email protected]>
Cc: Pavel Machek <[email protected]>
Acked-by: "Rafael J. Wysocki" <[email protected]>
Signed-off-by: Alexander Duyck <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: H. Peter Anvin <[email protected]>
---
arch/x86/kernel/acpi/sleep.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/acpi/sleep.c b/arch/x86/kernel/acpi/sleep.c
index 11676cf..f146a3c 100644
--- a/arch/x86/kernel/acpi/sleep.c
+++ b/arch/x86/kernel/acpi/sleep.c
@@ -69,7 +69,7 @@ int acpi_suspend_lowlevel(void)

#ifndef CONFIG_64BIT
header->pmode_entry = (u32)&wakeup_pmode_return;
- header->pmode_cr3 = (u32)__pa(&initial_page_table);
+ header->pmode_cr3 = (u32)__pa_symbol(initial_page_table);
saved_magic = 0x12345678;
#else /* CONFIG_64BIT */
#ifdef CONFIG_SMP

2012-11-17 00:30:28

by Yinghai Lu

[permalink] [raw]
Subject: Re: [tip:x86/mm] x86: Move some contents of page_64_types.h into pgtable_64.h and page_64.h

On Fri, Nov 16, 2012 at 4:22 PM, tip-bot for Alexander Duyck
<[email protected]> wrote:
> Commit-ID: 66d61384e9b4087f044ce86cb4adb12fe4623a6b
> Gitweb: http://git.kernel.org/tip/66d61384e9b4087f044ce86cb4adb12fe4623a6b
> Author: Alexander Duyck <[email protected]>
> AuthorDate: Fri, 16 Nov 2012 13:53:09 -0800
> Committer: H. Peter Anvin <[email protected]>
> CommitDate: Fri, 16 Nov 2012 15:20:10 -0800
>
> x86: Move some contents of page_64_types.h into pgtable_64.h and page_64.h
>
> This patch is meant to clean up the fact that we have several functions in
> page_64_types.h which really don't belong there. I found this issue when I
> had tried to replace __phys_addr with an inline function. It resulted in the
> realmode bits generating compile warnings about types. In order to resolve
> that I am relocating the address translation to page_64.h since this is in
> keeping with where these functions are located in 32 bit.
>
> In addition I have relocated several functions defined in init_64.c to
> pgtable_64.h as this seems to be where most of the functions related to
> memory initialization were already located.
>
> Signed-off-by: Alexander Duyck <[email protected]>
> Link: http://lkml.kernel.org/r/[email protected]
> Signed-off-by: H. Peter Anvin <[email protected]>
> ---
> arch/x86/include/asm/page_64.h | 19 +++++++++++++++++++
> arch/x86/include/asm/page_64_types.h | 22 ----------------------
> arch/x86/include/asm/pgtable_64.h | 5 +++++
> 3 files changed, 24 insertions(+), 22 deletions(-)
>
> diff --git a/arch/x86/include/asm/page_64.h b/arch/x86/include/asm/page_64.h
> index 072694e..4150999 100644
> --- a/arch/x86/include/asm/page_64.h
> +++ b/arch/x86/include/asm/page_64.h
> @@ -3,4 +3,23 @@
>
> #include <asm/page_64_types.h>
>
> +#ifndef __ASSEMBLY__
> +
> +/* duplicated to the one in bootmem.h */
> +extern unsigned long max_pfn;
> +extern unsigned long phys_base;
> +
> +extern unsigned long __phys_addr(unsigned long);
> +
> +#define __phys_reloc_hide(x) (x)
> +
> +#ifdef CONFIG_FLATMEM
> +#define pfn_valid(pfn) ((pfn) < max_pfn)
> +#endif
> +
> +void clear_page(void *page);
> +void copy_page(void *to, void *from);
> +
> +#endif /* !__ASSEMBLY__ */
> +
> #endif /* _ASM_X86_PAGE_64_H */
> diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
> index 320f7bb..8b491e6 100644
> --- a/arch/x86/include/asm/page_64_types.h
> +++ b/arch/x86/include/asm/page_64_types.h
> @@ -50,26 +50,4 @@
> #define KERNEL_IMAGE_SIZE (512 * 1024 * 1024)
> #define KERNEL_IMAGE_START _AC(0xffffffff80000000, UL)
>
> -#ifndef __ASSEMBLY__
> -void clear_page(void *page);
> -void copy_page(void *to, void *from);
> -
> -/* duplicated to the one in bootmem.h */
> -extern unsigned long max_pfn;
> -extern unsigned long phys_base;
> -
> -extern unsigned long __phys_addr(unsigned long);
> -#define __phys_reloc_hide(x) (x)
> -
> -#define vmemmap ((struct page *)VMEMMAP_START)
> -
> -extern void init_extra_mapping_uc(unsigned long phys, unsigned long size);
> -extern void init_extra_mapping_wb(unsigned long phys, unsigned long size);
> -
> -#endif /* !__ASSEMBLY__ */
> -
> -#ifdef CONFIG_FLATMEM
> -#define pfn_valid(pfn) ((pfn) < max_pfn)
> -#endif
> -
> #endif /* _ASM_X86_PAGE_64_DEFS_H */
> diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h
> index 47356f9..b5d30ad 100644
> --- a/arch/x86/include/asm/pgtable_64.h
> +++ b/arch/x86/include/asm/pgtable_64.h
> @@ -183,6 +183,11 @@ extern void cleanup_highmap(void);
>
> #define __HAVE_ARCH_PTE_SAME
>
> +#define vmemmap ((struct page *)VMEMMAP_START)
> +
> +extern void init_extra_mapping_uc(unsigned long phys, unsigned long size);
> +extern void init_extra_mapping_wb(unsigned long phys, unsigned long size);
> +
> #endif /* !__ASSEMBLY__ */
>
> #endif /* _ASM_X86_PGTABLE_64_H */
> --

arch/x86/kernel/apic/apic_numachip.c: In function 'map_csrs':
arch/x86/kernel/apic/apic_numachip.c:158:2: error: implicit
declaration of function 'init_extra_mapping_uc'
[-Werror=implicit-function-declaration]
cc1: some warnings being treated as errors

2012-11-17 00:30:41

by Duyck, Alexander H

[permalink] [raw]
Subject: [tip:x86/mm] x86/lguest: Use __pa_symbol instead of __pa on C visible symbols

Commit-ID: 0fde5066220588e8144b996ac4a3df5a91c93459
Gitweb: http://git.kernel.org/tip/0fde5066220588e8144b996ac4a3df5a91c93459
Author: Alexander Duyck <[email protected]>
AuthorDate: Fri, 16 Nov 2012 13:58:12 -0800
Committer: H. Peter Anvin <[email protected]>
CommitDate: Fri, 16 Nov 2012 15:20:53 -0800

x86/lguest: Use __pa_symbol instead of __pa on C visible symbols

The function lguest_write_cr3 uses __pa to convert swapper_pg_dir and
initial_page_table from virtual addresses to physical ones. The correct
function to use for these values is __pa_symbol, since they are C visible
symbols.

Cc: Rusty Russell <[email protected]>
Signed-off-by: Alexander Duyck <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: H. Peter Anvin <[email protected]>
---
arch/x86/lguest/boot.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/lguest/boot.c b/arch/x86/lguest/boot.c
index 642d880..139dd35 100644
--- a/arch/x86/lguest/boot.c
+++ b/arch/x86/lguest/boot.c
@@ -552,7 +552,8 @@ static void lguest_write_cr3(unsigned long cr3)
current_cr3 = cr3;

/* These two page tables are simple, linear, and used during boot */
- if (cr3 != __pa(swapper_pg_dir) && cr3 != __pa(initial_page_table))
+ if (cr3 != __pa_symbol(swapper_pg_dir) &&
+ cr3 != __pa_symbol(initial_page_table))
cr3_changed = true;
}

2012-11-17 00:43:12

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [tip:x86/mm] x86: Move some contents of page_64_types.h into pgtable_64.h and page_64.h

On 11/16/2012 04:30 PM, Yinghai Lu wrote:
>
> arch/x86/kernel/apic/apic_numachip.c: In function 'map_csrs':
> arch/x86/kernel/apic/apic_numachip.c:158:2: error: implicit
> declaration of function 'init_extra_mapping_uc'
> [-Werror=implicit-function-declaration]
> cc1: some warnings being treated as errors
>

A missing #include; I folded the fix and re-pushed the branch.

-hpa

2012-11-17 00:50:12

by Duyck, Alexander H

[permalink] [raw]
Subject: [tip:x86/mm] x86: Move some contents of page_64_types.h into pgtable_64.h and page_64.h

Commit-ID: fb50b020c5331c8c4bee0eb875865f5f8be6c03a
Gitweb: http://git.kernel.org/tip/fb50b020c5331c8c4bee0eb875865f5f8be6c03a
Author: Alexander Duyck <[email protected]>
AuthorDate: Fri, 16 Nov 2012 13:53:09 -0800
Committer: H. Peter Anvin <[email protected]>
CommitDate: Fri, 16 Nov 2012 16:40:34 -0800

x86: Move some contents of page_64_types.h into pgtable_64.h and page_64.h

This patch is meant to clean up the fact that we have several functions in
page_64_types.h which really don't belong there. I found this issue when I
had tried to replace __phys_addr with an inline function. It resulted in the
realmode bits generating compile warnings about types. In order to resolve
that I am relocating the address translation to page_64.h since this is in
keeping with where these functions are located in 32 bit.

In addition I have relocated several functions defined in init_64.c to
pgtable_64.h as this seems to be where most of the functions related to
memory initialization were already located.

[ hpa: added missing #include <asm/pgtable.h> to apic_numachip.c,
as reported by Yinghai Lu. ]

Signed-off-by: Alexander Duyck <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: H. Peter Anvin <[email protected]>
Cc: Yinghai Lu <[email protected]>
Cc: Daniel J Blueman <[email protected]>
---
arch/x86/include/asm/page_64.h | 19 +++++++++++++++++++
arch/x86/include/asm/page_64_types.h | 22 ----------------------
arch/x86/include/asm/pgtable_64.h | 5 +++++
arch/x86/kernel/apic/apic_numachip.c | 1 +
4 files changed, 25 insertions(+), 22 deletions(-)

diff --git a/arch/x86/include/asm/page_64.h b/arch/x86/include/asm/page_64.h
index 072694e..4150999 100644
--- a/arch/x86/include/asm/page_64.h
+++ b/arch/x86/include/asm/page_64.h
@@ -3,4 +3,23 @@

#include <asm/page_64_types.h>

+#ifndef __ASSEMBLY__
+
+/* duplicated to the one in bootmem.h */
+extern unsigned long max_pfn;
+extern unsigned long phys_base;
+
+extern unsigned long __phys_addr(unsigned long);
+
+#define __phys_reloc_hide(x) (x)
+
+#ifdef CONFIG_FLATMEM
+#define pfn_valid(pfn) ((pfn) < max_pfn)
+#endif
+
+void clear_page(void *page);
+void copy_page(void *to, void *from);
+
+#endif /* !__ASSEMBLY__ */
+
#endif /* _ASM_X86_PAGE_64_H */
diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
index 320f7bb..8b491e6 100644
--- a/arch/x86/include/asm/page_64_types.h
+++ b/arch/x86/include/asm/page_64_types.h
@@ -50,26 +50,4 @@
#define KERNEL_IMAGE_SIZE (512 * 1024 * 1024)
#define KERNEL_IMAGE_START _AC(0xffffffff80000000, UL)

-#ifndef __ASSEMBLY__
-void clear_page(void *page);
-void copy_page(void *to, void *from);
-
-/* duplicated to the one in bootmem.h */
-extern unsigned long max_pfn;
-extern unsigned long phys_base;
-
-extern unsigned long __phys_addr(unsigned long);
-#define __phys_reloc_hide(x) (x)
-
-#define vmemmap ((struct page *)VMEMMAP_START)
-
-extern void init_extra_mapping_uc(unsigned long phys, unsigned long size);
-extern void init_extra_mapping_wb(unsigned long phys, unsigned long size);
-
-#endif /* !__ASSEMBLY__ */
-
-#ifdef CONFIG_FLATMEM
-#define pfn_valid(pfn) ((pfn) < max_pfn)
-#endif
-
#endif /* _ASM_X86_PAGE_64_DEFS_H */
diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h
index 47356f9..b5d30ad 100644
--- a/arch/x86/include/asm/pgtable_64.h
+++ b/arch/x86/include/asm/pgtable_64.h
@@ -183,6 +183,11 @@ extern void cleanup_highmap(void);

#define __HAVE_ARCH_PTE_SAME

+#define vmemmap ((struct page *)VMEMMAP_START)
+
+extern void init_extra_mapping_uc(unsigned long phys, unsigned long size);
+extern void init_extra_mapping_wb(unsigned long phys, unsigned long size);
+
#endif /* !__ASSEMBLY__ */

#endif /* _ASM_X86_PGTABLE_64_H */
diff --git a/arch/x86/kernel/apic/apic_numachip.c b/arch/x86/kernel/apic/apic_numachip.c
index a65829a..ae9196f 100644
--- a/arch/x86/kernel/apic/apic_numachip.c
+++ b/arch/x86/kernel/apic/apic_numachip.c
@@ -27,6 +27,7 @@
#include <asm/apic.h>
#include <asm/ipi.h>
#include <asm/apic_flat_64.h>
+#include <asm/pgtable.h>

static int numachip_system __read_mostly;

2012-11-17 00:51:18

by Duyck, Alexander H

[permalink] [raw]
Subject: [tip:x86/mm] x86: Improve __phys_addr performance by making use of carry flags and inlining

Commit-ID: 0bdf525f04afd3a32c14e5a8778771f9c9e0f074
Gitweb: http://git.kernel.org/tip/0bdf525f04afd3a32c14e5a8778771f9c9e0f074
Author: Alexander Duyck <[email protected]>
AuthorDate: Fri, 16 Nov 2012 13:53:51 -0800
Committer: H. Peter Anvin <[email protected]>
CommitDate: Fri, 16 Nov 2012 16:42:08 -0800

x86: Improve __phys_addr performance by making use of carry flags and inlining

This patch is meant to improve overall system performance when making use of
the __phys_addr call. To do this I have implemented several changes.

First, if CONFIG_DEBUG_VIRTUAL is not defined, __phys_addr is made an inline,
similar to how this is currently handled on 32-bit. However, in order to do
this it is required to export phys_base so that it is available if __phys_addr
is used in kernel modules.

The second change was to streamline the code by making use of the carry flag
on an add operation instead of performing a compare on a 64 bit value. The
advantage to this is that it allows us to significantly reduce the overall
size of the call. On my Xeon E5 system the entire __phys_addr inline call
consumes a little less than 32 bytes and 5 instructions. I also applied
similar logic to the debug version of the function. My testing shows that the
debug version of the function with this patch applied is slightly faster than
the non-debug version without the patch.
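
To see the trick outside the kernel, here is a minimal user-space demo (the
constant and sample addresses are assumed examples): after y = x -
__START_KERNEL_map, the unsigned comparison x > y holds exactly when x >=
__START_KERNEL_map, because the subtraction borrows (sets the carry flag)
otherwise.

#include <stdio.h>

#define START_KERNEL_MAP 0xffffffff80000000UL /* assumed __START_KERNEL_map */

int main(void)
{
	unsigned long addrs[] = {
		0xffffffff81000000UL, /* hypothetical kernel text address */
		0xffff880000100000UL, /* hypothetical direct-map address */
	};
	int i;

	for (i = 0; i < 2; i++) {
		unsigned long x = addrs[i];
		unsigned long y = x - START_KERNEL_MAP;

		/* x > y iff the subtraction did not wrap */
		printf("%#lx: %s\n", x,
		       x > y ? "kernel text range" : "direct map range");
	}
	return 0;
}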

Finally, I also applied the same logic changes to __virt_addr_valid, since it
used the same general code flow as __phys_addr and could achieve similar gains
through these changes.

Signed-off-by: Alexander Duyck <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: H. Peter Anvin <[email protected]>
---
arch/x86/include/asm/page_64.h | 14 ++++++++++++++
arch/x86/kernel/x8664_ksyms_64.c | 3 +++
arch/x86/mm/physaddr.c | 40 +++++++++++++++++++++++++---------------
3 files changed, 42 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/page_64.h b/arch/x86/include/asm/page_64.h
index 4150999..5138174 100644
--- a/arch/x86/include/asm/page_64.h
+++ b/arch/x86/include/asm/page_64.h
@@ -9,7 +9,21 @@
extern unsigned long max_pfn;
extern unsigned long phys_base;

+static inline unsigned long __phys_addr_nodebug(unsigned long x)
+{
+ unsigned long y = x - __START_KERNEL_map;
+
+ /* use the carry flag to determine if x was < __START_KERNEL_map */
+ x = y + ((x > y) ? phys_base : (__START_KERNEL_map - PAGE_OFFSET));
+
+ return x;
+}
+
+#ifdef CONFIG_DEBUG_VIRTUAL
extern unsigned long __phys_addr(unsigned long);
+#else
+#define __phys_addr(x) __phys_addr_nodebug(x)
+#endif

#define __phys_reloc_hide(x) (x)

diff --git a/arch/x86/kernel/x8664_ksyms_64.c b/arch/x86/kernel/x8664_ksyms_64.c
index 1330dd1..b014d94 100644
--- a/arch/x86/kernel/x8664_ksyms_64.c
+++ b/arch/x86/kernel/x8664_ksyms_64.c
@@ -59,6 +59,9 @@ EXPORT_SYMBOL(memcpy);
EXPORT_SYMBOL(__memcpy);
EXPORT_SYMBOL(memmove);

+#ifndef CONFIG_DEBUG_VIRTUAL
+EXPORT_SYMBOL(phys_base);
+#endif
EXPORT_SYMBOL(empty_zero_page);
#ifndef CONFIG_PARAVIRT
EXPORT_SYMBOL(native_load_gs_index);
diff --git a/arch/x86/mm/physaddr.c b/arch/x86/mm/physaddr.c
index d2e2735..fd40d75 100644
--- a/arch/x86/mm/physaddr.c
+++ b/arch/x86/mm/physaddr.c
@@ -8,33 +8,43 @@

#ifdef CONFIG_X86_64

+#ifdef CONFIG_DEBUG_VIRTUAL
unsigned long __phys_addr(unsigned long x)
{
- if (x >= __START_KERNEL_map) {
- x -= __START_KERNEL_map;
- VIRTUAL_BUG_ON(x >= KERNEL_IMAGE_SIZE);
- x += phys_base;
+ unsigned long y = x - __START_KERNEL_map;
+
+ /* use the carry flag to determine if x was < __START_KERNEL_map */
+ if (unlikely(x > y)) {
+ x = y + phys_base;
+
+ VIRTUAL_BUG_ON(y >= KERNEL_IMAGE_SIZE);
} else {
- VIRTUAL_BUG_ON(x < PAGE_OFFSET);
- x -= PAGE_OFFSET;
- VIRTUAL_BUG_ON(!phys_addr_valid(x));
+ x = y + (__START_KERNEL_map - PAGE_OFFSET);
+
+ /* carry flag will be set if starting x was >= PAGE_OFFSET */
+ VIRTUAL_BUG_ON((x > y) || !phys_addr_valid(x));
}
+
return x;
}
EXPORT_SYMBOL(__phys_addr);
+#endif

bool __virt_addr_valid(unsigned long x)
{
- if (x >= __START_KERNEL_map) {
- x -= __START_KERNEL_map;
- if (x >= KERNEL_IMAGE_SIZE)
+ unsigned long y = x - __START_KERNEL_map;
+
+ /* use the carry flag to determine if x was < __START_KERNEL_map */
+ if (unlikely(x > y)) {
+ x = y + phys_base;
+
+ if (y >= KERNEL_IMAGE_SIZE)
return false;
- x += phys_base;
} else {
- if (x < PAGE_OFFSET)
- return false;
- x -= PAGE_OFFSET;
- if (!phys_addr_valid(x))
+ x = y + (__START_KERNEL_map - PAGE_OFFSET);
+
+ /* carry flag will be set if starting x was >= PAGE_OFFSET */
+ if ((x > y) || !phys_addr_valid(x))
return false;
}

2012-11-17 00:52:14

by Duyck, Alexander H

[permalink] [raw]
Subject: [tip:x86/mm] x86: Make it so that __pa_symbol can only process kernel symbols on x86_64

Commit-ID: 7d74275d39def4d3ccc8cf4725388bf79ef13861
Gitweb: http://git.kernel.org/tip/7d74275d39def4d3ccc8cf4725388bf79ef13861
Author: Alexander Duyck <[email protected]>
AuthorDate: Fri, 16 Nov 2012 13:55:46 -0800
Committer: H. Peter Anvin <[email protected]>
CommitDate: Fri, 16 Nov 2012 16:42:09 -0800

x86: Make it so that __pa_symbol can only process kernel symbols on x86_64

I submitted an earlier patch that made __phys_addr an inline. This obviously
results in an increase in the code size. One step I can take to reduce that
is to make it so that the __pa_symbol call does a direct translation for
kernel addresses instead of covering all of virtual memory.

On my system this reduced the size for __pa_symbol from 5 instructions
totalling 30 bytes to 3 instructions totalling 16 bytes.
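
The same wraparound that drives the carry-flag trick also explains why the
debug check in the patch below only needs an upper bound. A user-space
sketch, with an assumed layout constant and a hypothetical address:

#include <assert.h>

#define START_KERNEL_MAP 0xffffffff80000000UL   /* assumed */
#define KERNEL_IMAGE_SIZE (512UL * 1024 * 1024) /* assumed 512M image */

int main(void)
{
	/* hypothetical address below __START_KERNEL_map */
	unsigned long low = 0xffff880000100000UL;
	unsigned long y = low - START_KERNEL_MAP; /* wraps to a huge value */

	/* so the single y >= KERNEL_IMAGE_SIZE test also rejects it */
	assert(y >= KERNEL_IMAGE_SIZE);
	return 0;
}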

Signed-off-by: Alexander Duyck <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: H. Peter Anvin <[email protected]>
---
arch/x86/include/asm/page.h | 3 ++-
arch/x86/include/asm/page_32.h | 1 +
arch/x86/include/asm/page_64.h | 3 +++
arch/x86/mm/physaddr.c | 11 +++++++++++
4 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/page.h b/arch/x86/include/asm/page.h
index 8ca8283..3698a6a 100644
--- a/arch/x86/include/asm/page.h
+++ b/arch/x86/include/asm/page.h
@@ -44,7 +44,8 @@ static inline void copy_user_page(void *to, void *from, unsigned long vaddr,
* case properly. Once all supported versions of gcc understand it, we can
* remove this Voodoo magic stuff. (i.e. once gcc3.x is deprecated)
*/
-#define __pa_symbol(x) __pa(__phys_reloc_hide((unsigned long)(x)))
+#define __pa_symbol(x) \
+ __phys_addr_symbol(__phys_reloc_hide((unsigned long)(x)))

#define __va(x) ((void *)((unsigned long)(x)+PAGE_OFFSET))

diff --git a/arch/x86/include/asm/page_32.h b/arch/x86/include/asm/page_32.h
index da4e762..4d550d0 100644
--- a/arch/x86/include/asm/page_32.h
+++ b/arch/x86/include/asm/page_32.h
@@ -15,6 +15,7 @@ extern unsigned long __phys_addr(unsigned long);
#else
#define __phys_addr(x) __phys_addr_nodebug(x)
#endif
+#define __phys_addr_symbol(x) __phys_addr(x)
#define __phys_reloc_hide(x) RELOC_HIDE((x), 0)

#ifdef CONFIG_FLATMEM
diff --git a/arch/x86/include/asm/page_64.h b/arch/x86/include/asm/page_64.h
index 5138174..0f1ddee 100644
--- a/arch/x86/include/asm/page_64.h
+++ b/arch/x86/include/asm/page_64.h
@@ -21,8 +21,11 @@ static inline unsigned long __phys_addr_nodebug(unsigned long x)

#ifdef CONFIG_DEBUG_VIRTUAL
extern unsigned long __phys_addr(unsigned long);
+extern unsigned long __phys_addr_symbol(unsigned long);
#else
#define __phys_addr(x) __phys_addr_nodebug(x)
+#define __phys_addr_symbol(x) \
+ ((unsigned long)(x) - __START_KERNEL_map + phys_base)
#endif

#define __phys_reloc_hide(x) (x)
diff --git a/arch/x86/mm/physaddr.c b/arch/x86/mm/physaddr.c
index fd40d75..c73fedd 100644
--- a/arch/x86/mm/physaddr.c
+++ b/arch/x86/mm/physaddr.c
@@ -28,6 +28,17 @@ unsigned long __phys_addr(unsigned long x)
return x;
}
EXPORT_SYMBOL(__phys_addr);
+
+unsigned long __phys_addr_symbol(unsigned long x)
+{
+ unsigned long y = x - __START_KERNEL_map;
+
+ /* only check upper bounds since lower bounds will trigger carry */
+ VIRTUAL_BUG_ON(y >= KERNEL_IMAGE_SIZE);
+
+ return y + phys_base;
+}
+EXPORT_SYMBOL(__phys_addr_symbol);
#endif

bool __virt_addr_valid(unsigned long x)

2012-11-17 00:53:34

by Duyck, Alexander H

[permalink] [raw]
Subject: [tip:x86/mm] x86: Drop 4 unnecessary calls to __pa_symbol

Commit-ID: 05a476b6e3795f205806662bf09ab95774266292
Gitweb: http://git.kernel.org/tip/05a476b6e3795f205806662bf09ab95774266292
Author: Alexander Duyck <[email protected]>
AuthorDate: Fri, 16 Nov 2012 13:56:35 -0800
Committer: H. Peter Anvin <[email protected]>
CommitDate: Fri, 16 Nov 2012 16:42:09 -0800

x86: Drop 4 unnecessary calls to __pa_symbol

While debugging the __pa_symbol inline patch I found that there were a couple
spots where __pa_symbol was used as follows:
__pa_symbol(x) - __pa_symbol(y)

The compiler had reduced them to:
x - y

Since we also support a debug case where __pa_symbol is a function call, it
would probably be useful to change the two cases I found so that they are
always treated as "x - y". As such I am casting the values to phys_addr_t and
then doing simple subtraction so that the correct type and value are returned.
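
A quick sanity check of that algebra, with hypothetical link addresses
standing in for _text and __bss_stop and an assumed phys_base:

#include <assert.h>

#define START_KERNEL_MAP 0xffffffff80000000UL /* assumed */

static unsigned long phys_base = 0x1000000UL; /* assumed */

/* model of the non-debug __pa_symbol() */
static unsigned long pa_symbol(unsigned long x)
{
	return x - START_KERNEL_MAP + phys_base;
}

int main(void)
{
	unsigned long text = 0xffffffff81000000UL;     /* hypothetical _text */
	unsigned long bss_stop = 0xffffffff81e00000UL; /* hypothetical __bss_stop */

	/* __START_KERNEL_map and phys_base cancel in the difference */
	assert(pa_symbol(bss_stop) - pa_symbol(text) == bss_stop - text);
	return 0;
}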

Signed-off-by: Alexander Duyck <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: H. Peter Anvin <[email protected]>
---
arch/x86/kernel/head32.c | 4 ++--
arch/x86/kernel/head64.c | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/head32.c b/arch/x86/kernel/head32.c
index c18f59d..f15db0c 100644
--- a/arch/x86/kernel/head32.c
+++ b/arch/x86/kernel/head32.c
@@ -30,8 +30,8 @@ static void __init i386_default_early_setup(void)

void __init i386_start_kernel(void)
{
- memblock_reserve(__pa_symbol(&_text),
- __pa_symbol(&__bss_stop) - __pa_symbol(&_text));
+ memblock_reserve(__pa_symbol(_text),
+ (phys_addr_t)__bss_stop - (phys_addr_t)_text);

#ifdef CONFIG_BLK_DEV_INITRD
/* Reserve INITRD */
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 037df57..42f5df1 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -97,8 +97,8 @@ void __init x86_64_start_reservations(char *real_mode_data)
{
copy_bootdata(__va(real_mode_data));

- memblock_reserve(__pa_symbol(&_text),
- __pa_symbol(&__bss_stop) - __pa_symbol(&_text));
+ memblock_reserve(__pa_symbol(_text),
+ (phys_addr_t)__bss_stop - (phys_addr_t)_text);

#ifdef CONFIG_BLK_DEV_INITRD
/* Reserve INITRD */

2012-11-17 00:54:38

by Duyck, Alexander H

[permalink] [raw]
Subject: [tip:x86/mm] x86: Use __pa_symbol instead of __pa on C visible symbols

Commit-ID: fc8d782677f163dee76427fdd8a92bebd2b50b23
Gitweb: http://git.kernel.org/tip/fc8d782677f163dee76427fdd8a92bebd2b50b23
Author: Alexander Duyck <[email protected]>
AuthorDate: Fri, 16 Nov 2012 13:57:13 -0800
Committer: H. Peter Anvin <[email protected]>
CommitDate: Fri, 16 Nov 2012 16:42:09 -0800

x86: Use __pa_symbol instead of __pa on C visible symbols

When I attempted to separate __pa_symbol and __pa, I found a number of cases
where __pa was used on an obvious symbol.

I also caught one non-obvious case, as _brk_start and _brk_end are based on
the address of __brk_base, which is a C visible symbol.

In mark_rodata_ro I was able to reduce the overhead of translating kernel
symbols to virtual memory by using __va(__pa_symbol()) instead of
page_address(virt_to_page()).
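
The equivalence can be sketched in user space as well; virt_to_page() plus
page_address() is modeled here as a round-trip through the page frame
number, and every constant is an assumed stand-in:

#include <assert.h>

#define PAGE_SHIFT 12
#define START_KERNEL_MAP  0xffffffff80000000UL /* assumed */
#define PAGE_OFFSET_MODEL 0xffff880000000000UL /* assumed */

static unsigned long phys_base = 0x1000000UL; /* assumed, page aligned */

static unsigned long pa_symbol(unsigned long x)
{
	return x - START_KERNEL_MAP + phys_base;
}

static unsigned long va(unsigned long phys)
{
	return phys + PAGE_OFFSET_MODEL;
}

/* page_address(virt_to_page(x)) modeled as phys -> pfn -> phys -> va */
static unsigned long page_address_model(unsigned long x)
{
	unsigned long pfn = pa_symbol(x) >> PAGE_SHIFT;

	return va(pfn << PAGE_SHIFT);
}

int main(void)
{
	/* hypothetical page-aligned kernel-image address */
	unsigned long text_end = 0xffffffff81a00000UL;

	/* for page-aligned addresses the two chains agree, but the new
	 * form needs no struct page lookup */
	assert(page_address_model(text_end) == va(pa_symbol(text_end)));
	return 0;
}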

Signed-off-by: Alexander Duyck <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: H. Peter Anvin <[email protected]>
---
arch/x86/kernel/cpu/intel.c | 2 +-
arch/x86/kernel/setup.c | 16 ++++++++--------
arch/x86/mm/init_64.c | 18 ++++++++----------
arch/x86/mm/pageattr.c | 8 ++++----
arch/x86/platform/efi/efi.c | 4 ++--
arch/x86/realmode/init.c | 8 ++++----
6 files changed, 27 insertions(+), 29 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 198e019..2249e7e 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -168,7 +168,7 @@ int __cpuinit ppro_with_ram_bug(void)
#ifdef CONFIG_X86_F00F_BUG
static void __cpuinit trap_init_f00f_bug(void)
{
- __set_fixmap(FIX_F00F_IDT, __pa(&idt_table), PAGE_KERNEL_RO);
+ __set_fixmap(FIX_F00F_IDT, __pa_symbol(idt_table), PAGE_KERNEL_RO);

/*
* Update the IDT descriptor and reload the IDT so that
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index ca45696..2702c5d 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -300,8 +300,8 @@ static void __init cleanup_highmap(void)
static void __init reserve_brk(void)
{
if (_brk_end > _brk_start)
- memblock_reserve(__pa(_brk_start),
- __pa(_brk_end) - __pa(_brk_start));
+ memblock_reserve(__pa_symbol(_brk_start),
+ _brk_end - _brk_start);

/* Mark brk area as locked down and no longer taking any
new allocations */
@@ -761,12 +761,12 @@ void __init setup_arch(char **cmdline_p)
init_mm.end_data = (unsigned long) _edata;
init_mm.brk = _brk_end;

- code_resource.start = virt_to_phys(_text);
- code_resource.end = virt_to_phys(_etext)-1;
- data_resource.start = virt_to_phys(_etext);
- data_resource.end = virt_to_phys(_edata)-1;
- bss_resource.start = virt_to_phys(&__bss_start);
- bss_resource.end = virt_to_phys(&__bss_stop)-1;
+ code_resource.start = __pa_symbol(_text);
+ code_resource.end = __pa_symbol(_etext)-1;
+ data_resource.start = __pa_symbol(_etext);
+ data_resource.end = __pa_symbol(_edata)-1;
+ bss_resource.start = __pa_symbol(__bss_start);
+ bss_resource.end = __pa_symbol(__bss_stop)-1;

#ifdef CONFIG_CMDLINE_BOOL
#ifdef CONFIG_CMDLINE_OVERRIDE
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 3baff25..0374a10 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -770,12 +770,10 @@ void set_kernel_text_ro(void)
void mark_rodata_ro(void)
{
unsigned long start = PFN_ALIGN(_text);
- unsigned long rodata_start =
- ((unsigned long)__start_rodata + PAGE_SIZE - 1) & PAGE_MASK;
+ unsigned long rodata_start = PFN_ALIGN(__start_rodata);
unsigned long end = (unsigned long) &__end_rodata_hpage_align;
- unsigned long text_end = PAGE_ALIGN((unsigned long) &__stop___ex_table);
- unsigned long rodata_end = PAGE_ALIGN((unsigned long) &__end_rodata);
- unsigned long data_start = (unsigned long) &_sdata;
+ unsigned long text_end = PFN_ALIGN(&__stop___ex_table);
+ unsigned long rodata_end = PFN_ALIGN(&__end_rodata);

printk(KERN_INFO "Write protecting the kernel read-only data: %luk\n",
(end - start) >> 10);
@@ -800,12 +798,12 @@ void mark_rodata_ro(void)
#endif

free_init_pages("unused kernel memory",
- (unsigned long) page_address(virt_to_page(text_end)),
- (unsigned long)
- page_address(virt_to_page(rodata_start)));
+ (unsigned long) __va(__pa_symbol(text_end)),
+ (unsigned long) __va(__pa_symbol(rodata_start)));
+
free_init_pages("unused kernel memory",
- (unsigned long) page_address(virt_to_page(rodata_end)),
- (unsigned long) page_address(virt_to_page(data_start)));
+ (unsigned long) __va(__pa_symbol(rodata_end)),
+ (unsigned long) __va(__pa_symbol(_sdata)));
}

#endif
diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index a718e0d..40f92f3 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -94,12 +94,12 @@ static inline void split_page_count(int level) { }

static inline unsigned long highmap_start_pfn(void)
{
- return __pa(_text) >> PAGE_SHIFT;
+ return __pa_symbol(_text) >> PAGE_SHIFT;
}

static inline unsigned long highmap_end_pfn(void)
{
- return __pa(roundup(_brk_end, PMD_SIZE)) >> PAGE_SHIFT;
+ return __pa_symbol(roundup(_brk_end, PMD_SIZE)) >> PAGE_SHIFT;
}

#endif
@@ -276,8 +276,8 @@ static inline pgprot_t static_protections(pgprot_t prot, unsigned long address,
* The .rodata section needs to be read-only. Using the pfn
* catches all aliases.
*/
- if (within(pfn, __pa((unsigned long)__start_rodata) >> PAGE_SHIFT,
- __pa((unsigned long)__end_rodata) >> PAGE_SHIFT))
+ if (within(pfn, __pa_symbol(__start_rodata) >> PAGE_SHIFT,
+ __pa_symbol(__end_rodata) >> PAGE_SHIFT))
pgprot_val(forbidden) |= _PAGE_RW;

#if defined(CONFIG_X86_64) && defined(CONFIG_DEBUG_RODATA)
diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index ad44391..1b60026 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -410,8 +410,8 @@ void __init efi_reserve_boot_services(void)
* - Not within any part of the kernel
* - Not the bios reserved area
*/
- if ((start+size >= virt_to_phys(_text)
- && start <= virt_to_phys(_end)) ||
+ if ((start+size >= __pa_symbol(_text)
+ && start <= __pa_symbol(_end)) ||
!e820_all_mapped(start, start+size, E820_RAM) ||
memblock_is_region_reserved(start, size)) {
/* Could not reserve, skip it */
diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
index cbca565..8045026 100644
--- a/arch/x86/realmode/init.c
+++ b/arch/x86/realmode/init.c
@@ -62,9 +62,9 @@ void __init setup_real_mode(void)
__va(real_mode_header->trampoline_header);

#ifdef CONFIG_X86_32
- trampoline_header->start = __pa(startup_32_smp);
+ trampoline_header->start = __pa_symbol(startup_32_smp);
trampoline_header->gdt_limit = __BOOT_DS + 7;
- trampoline_header->gdt_base = __pa(boot_gdt);
+ trampoline_header->gdt_base = __pa_symbol(boot_gdt);
#else
/*
* Some AMD processors will #GP(0) if EFER.LMA is set in WRMSR
@@ -78,8 +78,8 @@ void __init setup_real_mode(void)
*trampoline_cr4_features = read_cr4();

trampoline_pgd = (u64 *) __va(real_mode_header->trampoline_pgd);
- trampoline_pgd[0] = __pa(level3_ident_pgt) + _KERNPG_TABLE;
- trampoline_pgd[511] = __pa(level3_kernel_pgt) + _KERNPG_TABLE;
+ trampoline_pgd[0] = __pa_symbol(level3_ident_pgt) + _KERNPG_TABLE;
+ trampoline_pgd[511] = __pa_symbol(level3_kernel_pgt) + _KERNPG_TABLE;
#endif
}

2012-11-17 00:55:33

by Duyck, Alexander H

[permalink] [raw]
Subject: [tip:x86/mm] x86/ftrace: Use __pa_symbol instead of __pa on C visible symbols

Commit-ID: 217f155e9fc68bf2a6c58a7b47e0d1ce68d78818
Gitweb: http://git.kernel.org/tip/217f155e9fc68bf2a6c58a7b47e0d1ce68d78818
Author: Alexander Duyck <[email protected]>
AuthorDate: Fri, 16 Nov 2012 13:57:32 -0800
Committer: H. Peter Anvin <[email protected]>
CommitDate: Fri, 16 Nov 2012 16:42:09 -0800

x86/ftrace: Use __pa_symbol instead of __pa on C visible symbols

Instead of using __pa, which is meant to be a general function for converting
virtual addresses to physical addresses, we can use __pa_symbol, which is the
preferred way of decoding kernel text virtual addresses to physical addresses.

In this case we are not directly converting C visible symbols; however, if we
know that the instruction pointer is somewhere between _text and _etext, we
know that we are going to be translating an address from the kernel text
space.

Cc: Steven Rostedt <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Signed-off-by: Alexander Duyck <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: H. Peter Anvin <[email protected]>
---
arch/x86/kernel/ftrace.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 1d41402..42a392a 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -89,7 +89,7 @@ do_ftrace_mod_code(unsigned long ip, const void *new_code)
* kernel identity mapping to modify code.
*/
if (within(ip, (unsigned long)_text, (unsigned long)_etext))
- ip = (unsigned long)__va(__pa(ip));
+ ip = (unsigned long)__va(__pa_symbol(ip));

return probe_kernel_write((void *)ip, new_code, MCOUNT_INSN_SIZE);
}
@@ -279,7 +279,7 @@ static int ftrace_write(unsigned long ip, const char *val, int size)
* kernel identity mapping to modify code.
*/
if (within(ip, (unsigned long)_text, (unsigned long)_etext))
- ip = (unsigned long)__va(__pa(ip));
+ ip = (unsigned long)__va(__pa_symbol(ip));

return probe_kernel_write((void *)ip, val, size);
}

2012-11-17 00:56:34

by Duyck, Alexander H

[permalink] [raw]
Subject: [tip:x86/mm] x86/acpi: Use __pa_symbol instead of __pa on C visible symbols

Commit-ID: afd51a0e32cd79261f0e823400886ed322a355ac
Gitweb: http://git.kernel.org/tip/afd51a0e32cd79261f0e823400886ed322a355ac
Author: Alexander Duyck <[email protected]>
AuthorDate: Fri, 16 Nov 2012 13:57:43 -0800
Committer: H. Peter Anvin <[email protected]>
CommitDate: Fri, 16 Nov 2012 16:42:10 -0800

x86/acpi: Use __pa_symbol instead of __pa on C visible symbols

This change just updates one spot where __pa was being used when __pa_symbol
should have been used. By using __pa_symbol we are able to drop a few extra
lines of code, as we don't have to test whether the virtual pointer is part
of the kernel text or just standard virtual memory.

Cc: Len Brown <[email protected]>
Cc: Pavel Machek <[email protected]>
Acked-by: "Rafael J. Wysocki" <[email protected]>
Signed-off-by: Alexander Duyck <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: H. Peter Anvin <[email protected]>
---
arch/x86/kernel/acpi/sleep.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/acpi/sleep.c b/arch/x86/kernel/acpi/sleep.c
index 11676cf..f146a3c 100644
--- a/arch/x86/kernel/acpi/sleep.c
+++ b/arch/x86/kernel/acpi/sleep.c
@@ -69,7 +69,7 @@ int acpi_suspend_lowlevel(void)

#ifndef CONFIG_64BIT
header->pmode_entry = (u32)&wakeup_pmode_return;
- header->pmode_cr3 = (u32)__pa(&initial_page_table);
+ header->pmode_cr3 = (u32)__pa_symbol(initial_page_table);
saved_magic = 0x12345678;
#else /* CONFIG_64BIT */
#ifdef CONFIG_SMP