Subject: [tip:x86/asm] x86/asm: Optimize clear_page()

Commit-ID: 49ca7bb328c630dd43be626534b49e19513296fd
Gitweb: http://git.kernel.org/tip/49ca7bb328c630dd43be626534b49e19513296fd
Author: Borislav Petkov <[email protected]>
AuthorDate: Thu, 9 Feb 2017 01:34:49 +0100
Committer: Ingo Molnar <[email protected]>
CommitDate: Wed, 1 Mar 2017 10:18:32 +0100

x86/asm: Optimize clear_page()

Currently, we CALL clear_page() which then JMPs to the proper function
chosen by the alternatives.

What we should do instead is CALL the proper function directly. (This
was something Ingo suggested a while ago). So let's do that.

Measuring our favourite kernel build workload shows that there are no
significant changes in performance.

AMD
===
-- /tmp/before 2017-02-09 18:01:46.451961188 +0100
++ /tmp/after 2017-02-09 18:01:54.883961175 +0100
@@ -1,15 +1,15 @@
Performance counter stats for 'system wide' (5 runs):

- 1028960.373643 cpu-clock (msec) # 6.000 CPUs utilized ( +- 1.41% )
+ 1023086.018961 cpu-clock (msec) # 6.000 CPUs utilized ( +- 1.20% )
- 518,744 context-switches # 0.504 K/sec ( +- 1.04% )
+ 518,254 context-switches # 0.507 K/sec ( +- 1.01% )
- 38,112 cpu-migrations # 0.037 K/sec ( +- 1.95% )
+ 37,917 cpu-migrations # 0.037 K/sec ( +- 1.02% )
- 20,874,266 page-faults # 0.020 M/sec ( +- 0.07% )
+ 20,918,897 page-faults # 0.020 M/sec ( +- 0.18% )
- 2,043,646,230,667 cycles # 1.986 GHz ( +- 0.14% ) (66.67%)
+ 2,045,305,584,032 cycles # 1.999 GHz ( +- 0.16% ) (66.67%)
- 553,698,855,431 stalled-cycles-frontend # 27.09% frontend cycles idle ( +- 0.07% ) (66.67%)
+ 555,099,401,413 stalled-cycles-frontend # 27.14% frontend cycles idle ( +- 0.13% ) (66.67%)
- 621,544,286,390 stalled-cycles-backend # 30.41% backend cycles idle ( +- 0.39% ) (66.67%)
+ 621,371,430,254 stalled-cycles-backend # 30.38% backend cycles idle ( +- 0.32% ) (66.67%)
- 1,738,364,431,659 instructions # 0.85 insn per cycle
+ 1,739,895,771,901 instructions # 0.85 insn per cycle
- # 0.36 stalled cycles per insn ( +- 0.11% ) (66.67%)
+ # 0.36 stalled cycles per insn ( +- 0.13% ) (66.67%)
- 391,170,943,850 branches # 380.161 M/sec ( +- 0.13% ) (66.67%)
+ 391,398,551,757 branches # 382.567 M/sec ( +- 0.13% ) (66.67%)
- 22,567,810,411 branch-misses # 5.77% of all branches ( +- 0.11% ) (66.67%)
+ 22,574,726,683 branch-misses # 5.77% of all branches ( +- 0.13% ) (66.67%)

- 171.480741921 seconds time elapsed ( +- 1.41% )
+ 170.509229451 seconds time elapsed ( +- 1.20% )

Intel
=====

-- /tmp/before 2017-02-09 20:36:19.851947473 +0100
++ /tmp/after 2017-02-09 20:36:30.151947458 +0100
@@ -1,15 +1,15 @@
Performance counter stats for 'system wide' (5 runs):

- 2207248.598126 cpu-clock (msec) # 8.000 CPUs utilized ( +- 0.69% )
+ 2213300.106631 cpu-clock (msec) # 8.000 CPUs utilized ( +- 0.73% )
- 899,342 context-switches # 0.407 K/sec ( +- 0.68% )
+ 898,381 context-switches # 0.406 K/sec ( +- 0.79% )
- 80,553 cpu-migrations # 0.036 K/sec ( +- 1.13% )
+ 80,979 cpu-migrations # 0.037 K/sec ( +- 1.11% )
- 36,171,148 page-faults # 0.016 M/sec ( +- 0.02% )
+ 36,179,791 page-faults # 0.016 M/sec ( +- 0.02% )
- 6,665,288,826,484 cycles # 3.020 GHz ( +- 0.07% ) (83.33%)
+ 6,671,638,410,799 cycles # 3.014 GHz ( +- 0.06% ) (83.33%)
- 5,065,975,115,197 stalled-cycles-frontend # 76.01% frontend cycles idle ( +- 0.11% ) (83.33%)
+ 5,076,835,183,223 stalled-cycles-frontend # 76.10% frontend cycles idle ( +- 0.11% ) (83.33%)
- 3,841,556,350,614 stalled-cycles-backend # 57.64% backend cycles idle ( +- 0.13% ) (66.67%)
+ 3,852,823,974,333 stalled-cycles-backend # 57.75% backend cycles idle ( +- 0.12% ) (66.67%)
- 4,148,398,171,079 instructions # 0.62 insn per cycle
+ 4,148,997,156,059 instructions # 0.62 insn per cycle
- # 1.22 stalled cycles per insn ( +- 0.10% ) (83.33%)
+ # 1.22 stalled cycles per insn ( +- 0.11% ) (83.33%)
- 887,187,118,591 branches # 401.943 M/sec ( +- 0.09% ) (83.33%)
+ 887,271,341,121 branches # 400.882 M/sec ( +- 0.11% ) (83.33%)
- 30,139,439,034 branch-misses # 3.40% of all branches ( +- 0.09% ) (83.33%)
+ 30,134,864,997 branch-misses # 3.40% of all branches ( +- 0.06% ) (83.33%)

- 275.904405540 seconds time elapsed ( +- 0.69% )
+ 276.660352016 seconds time elapsed ( +- 0.73% )
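
One way to read the "no significant changes" claim is to compare the change
in elapsed time against the run-to-run spread that perf stat prints after
"+-". The helpers below are a minimal sketch of that check; the function
names and the choice of "larger of the two spreads" as the noise bound are
assumptions made for illustration, not something perf itself defines.

```c
#include <stdbool.h>

/* Relative change of `after` versus `before`, in percent. */
static double rel_change_pct(double before, double after)
{
	return (after - before) / before * 100.0;
}

/*
 * Hypothetical helper: treat a change as noise if its magnitude is
 * smaller than the larger of the two reported spreads (in percent).
 */
static bool within_noise(double before, double after,
			 double spread_before_pct, double spread_after_pct)
{
	double noise = spread_before_pct > spread_after_pct ?
		       spread_before_pct : spread_after_pct;
	double change = rel_change_pct(before, after);

	if (change < 0)
		change = -change;

	return change < noise;
}
```

Fed with the elapsed times above, the AMD run changes by roughly 0.6% against
a spread of up to 1.41%, and the Intel run by roughly 0.3% against a spread
of up to 0.73%, so both fall inside the noise under this rule.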

allmodconfig vmlinux size grows by ~1Kb but that's fine - we optimize
our calling of the clear_page variants.

   text     data      bss      dec     hex filename
9051979 23067670 27009024 59128673 3863b61 vmlinux
9053000 23067670 27009024 59129694 3863f5e vmlinux.clear_page

Reported-by: kernel test robot <[email protected]>
Tested-by: Fengguang Wu <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/include/asm/alternative.h | 17 +++++++++++++++++
arch/x86/include/asm/page_64.h | 15 ++++++++++++++-
arch/x86/lib/clear_page_64.S | 17 +++++++----------
3 files changed, 38 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
index 1b02038..12e3d8d 100644
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -227,6 +227,23 @@ static inline int alternatives_text_reserved(void *start, void *end)
}

/*
+ * Like alternative_call(), but there are two features and respective functions.
+ * If CPU has feature2, function2 is used.
+ * Otherwise, if CPU has feature1, function1 is used.
+ * Otherwise, old function is used.
+ */
+#define alternative_void_call_2(oldfunc, newfunc1, feature1, newfunc2, \
+ feature2, input...) \
+{ \
+ register void *__sp asm(_ASM_SP); \
+ asm volatile (ALTERNATIVE_2("call %P[old]", "call %P[new1]", feature1, \
+ "call %P[new2]", feature2) \
+ : "+r" (__sp) \
+ : [old] "i" (oldfunc), [new1] "i" (newfunc1), \
+ [new2] "i" (newfunc2), ## input); \
+}
+
+/*
* use this macro(s) if you need more than one output parameter
* in alternative_io
*/
diff --git a/arch/x86/include/asm/page_64.h b/arch/x86/include/asm/page_64.h
index b3bebf9..254abce 100644
--- a/arch/x86/include/asm/page_64.h
+++ b/arch/x86/include/asm/page_64.h
@@ -4,6 +4,7 @@
#include <asm/page_64_types.h>

#ifndef __ASSEMBLY__
+#include <asm/alternative.h>

/* duplicated to the one in bootmem.h */
extern unsigned long max_pfn;
@@ -34,7 +35,19 @@ extern unsigned long __phys_addr_symbol(unsigned long);
#define pfn_valid(pfn) ((pfn) < max_pfn)
#endif

-void clear_page(void *page);
+void clear_page_orig(void *page);
+void clear_page_rep(void *page);
+void clear_page_erms(void *page);
+
+static inline void clear_page(void *page)
+{
+ alternative_void_call_2(clear_page_orig,
+ clear_page_rep, X86_FEATURE_REP_GOOD,
+ clear_page_erms, X86_FEATURE_ERMS,
+ "D" (page)
+ : "memory", "rax", "rcx");
+}
+
void copy_page(void *to, void *from);

#endif /* !__ASSEMBLY__ */
diff --git a/arch/x86/lib/clear_page_64.S b/arch/x86/lib/clear_page_64.S
index 5e2af3a..81b1635 100644
--- a/arch/x86/lib/clear_page_64.S
+++ b/arch/x86/lib/clear_page_64.S
@@ -14,20 +14,15 @@
* Zero a page.
* %rdi - page
*/
-ENTRY(clear_page)
-
- ALTERNATIVE_2 "jmp clear_page_orig", "", X86_FEATURE_REP_GOOD, \
- "jmp clear_page_c_e", X86_FEATURE_ERMS
-
+ENTRY(clear_page_rep)
movl $4096/8,%ecx
xorl %eax,%eax
rep stosq
ret
-ENDPROC(clear_page)
-EXPORT_SYMBOL(clear_page)
+ENDPROC(clear_page_rep)
+EXPORT_SYMBOL_GPL(clear_page_rep)

ENTRY(clear_page_orig)
-
xorl %eax,%eax
movl $4096/64,%ecx
.p2align 4
@@ -47,10 +42,12 @@ ENTRY(clear_page_orig)
nop
ret
ENDPROC(clear_page_orig)
+EXPORT_SYMBOL_GPL(clear_page_orig)

-ENTRY(clear_page_c_e)
+ENTRY(clear_page_erms)
movl $4096,%ecx
xorl %eax,%eax
rep stosb
ret
-ENDPROC(clear_page_c_e)
+ENDPROC(clear_page_erms)
+EXPORT_SYMBOL_GPL(clear_page_erms)
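
The selection order spelled out in the alternative_void_call_2() comment
(feature2 wins over feature1, which wins over the old function) can be
modelled in user space. The sketch below is only an illustration: struct
model_cpu, the model_* functions and MODEL_PAGE_SIZE are hypothetical names,
and a function pointer stands in for what the kernel really does, which is
to patch the CALL instruction in place at boot so that no indirect call
remains at run time.

```c
#include <stdbool.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096

/* Hypothetical stand-ins for X86_FEATURE_REP_GOOD and X86_FEATURE_ERMS. */
struct model_cpu {
	bool rep_good;
	bool erms;
};

typedef void (*clear_page_fn)(void *page);

/* Models clear_page_orig: the real one is an unrolled store loop. */
static void model_clear_page_orig(void *page)
{
	memset(page, 0, MODEL_PAGE_SIZE);
}

/* Models clear_page_rep: the real one uses "rep stosq". */
static void model_clear_page_rep(void *page)
{
	memset(page, 0, MODEL_PAGE_SIZE);
}

/* Models clear_page_erms: the real one uses "rep stosb". */
static void model_clear_page_erms(void *page)
{
	memset(page, 0, MODEL_PAGE_SIZE);
}

/*
 * Same priority as the macro's comment: feature2 (ERMS) first, then
 * feature1 (REP_GOOD), otherwise the original function.
 */
static clear_page_fn model_pick_clear_page(const struct model_cpu *cpu)
{
	if (cpu->erms)
		return model_clear_page_erms;
	if (cpu->rep_good)
		return model_clear_page_rep;
	return model_clear_page_orig;
}
```

A caller would pick the variant once and then invoke it directly, e.g.
model_pick_clear_page(&cpu)(page). That mirrors the point of the patch: the
call site ends up calling the chosen variant itself instead of calling
clear_page() and jumping from there.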


2017-03-07 05:46:17

by Yinghai Lu

Subject: Re: [tip:x86/asm] x86/asm: Optimize clear_page()

On Wed, Mar 1, 2017 at 1:47 AM, tip-bot for Borislav Petkov
<[email protected]> wrote:
> Commit-ID: 49ca7bb328c630dd43be626534b49e19513296fd
> Gitweb: http://git.kernel.org/tip/49ca7bb328c630dd43be626534b49e19513296fd
> Author: Borislav Petkov <[email protected]>
> AuthorDate: Thu, 9 Feb 2017 01:34:49 +0100
> Committer: Ingo Molnar <[email protected]>
> CommitDate: Wed, 1 Mar 2017 10:18:32 +0100
>
> x86/asm: Optimize clear_page()
>
> Currently, we CALL clear_page() which then JMPs to the proper function
> chosen by the alternatives.
>
> What we should do instead is CALL the proper function directly. (This
> was something Ingo suggested a while ago). So let's do that.

looks like this one broke the kexec.
after revert it back, kexec work again.

10:~/k # sh kk
add_buffer: base:43fff6000 bufsz:80e0 memsz:a000
add_buffer: base:43fff1000 bufsz:44ce memsz:44ce
add_buffer: base:43c000000 bufsz:eb2360 memsz:352e000
add_buffer: base:439d0d000 bufsz:22f2060 memsz:22f2060
add_buffer: base:43fff0000 bufsz:70 memsz:70
add_buffer: base:43ffef000 bufsz:140 memsz:140
10:~/k # [ 79.250483] BUG: unable to handle kernel paging request at ffffc467661dc038
[ 79.251562] IP: __handle_mm_fault+0x256/0x910
[ 79.252157] PGD 0
[ 79.252159]
[ 79.252733] Oops: 0000 [#1] SMP
[ 79.253243] Modules linked in:
[ 79.253718] CPU: 4 PID: 5593 Comm: hald-addon-stor Not tainted 4.11.0-rc1-yh-00100-g00db9e3-dirty #175
[ 79.255054] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
[ 79.256069] task: ffff8b43794c0000 task.stack: ffffb30dc6dac000
[ 79.256887] RIP: 0010:__handle_mm_fault+0x256/0x910
[ 79.257545] RSP: 0000:ffffb30dc6dafdd0 EFLAGS: 00010282
[ 79.258225] RAX: 00003928261dc000 RBX: ffff8b417a38dcf0 RCX: 00003ffffffff000
[ 79.259175] RDX: 09cc3928261dcc7c RSI: 09cc3928261dcc7c RDI: ffffb30dc6dafe48
[ 79.260126] RBP: ffffb30dc6dafe70 R08: 0000000000000001 R09: ffff8b43794c0c60
[ 79.261095] R10: 000000003638e619 R11: 0000000000000001 R12: ffff8b427a72a538
[ 79.261963] R13: ffffc467661dc038 R14: ffffb30dc6dafde0 R15: 0000000000000154
[ 79.262903] FS: 00007f29c1ce4740(0000) GS:ffff8b427ba00000(0000) knlGS:0000000000000000
[ 79.263973] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 79.264741] CR2: ffffc467661dc038 CR3: 000000033a512000 CR4: 00000000000006e0
[ 79.265679] Call Trace:
[ 79.266003] ? handle_mm_fault+0x138/0x320
[ 79.266431] handle_mm_fault+0x247/0x320
[ 79.266968] ? handle_mm_fault+0x47/0x320
[ 79.267491] __do_page_fault+0x49f/0x500
[ 79.268039] do_page_fault+0x65/0x80
[ 79.268508] page_fault+0x22/0x30
[ 79.268975] RIP: 0033:0x7f29c0ed53e8
[ 79.269443] RSP: 002b:00007ffe63a0e080 EFLAGS: 00010246
[ 79.271605] RAX: 0000000000000000 RBX: 00000000000007c7 RCX: 00007f29c0ed53e8
[ 79.272794] RDX: 00000000000007c7 RSI: 0000000000000002 RDI: 000000000060d0e0
[ 79.273741] RBP: 0000000000000002 R08: 00007f29c1457de0 R09: 0000000000000000
[ 79.274698] R10: 0000000000000001 R11: 0000000000000246 R12: 000000000060ac20
[ 79.275648] R13: 000000000060d0e0 R14: 000000000060ac28 R15: 00007f29c1457de0
[ 79.276596] Code: 3f 00 00 41 81 e5 f8 0f 00 00 f6 c2 80 48 0f 44 c1 4c 03 2d 25 9d ca 01 48 21 d0 49 01 c5 4d 85 ed 4c 89 6d 90 0f 84 d1 04 00 00 <49> 8b 75 00 48 f7 c6 9f ff ff ff 75 6a 48 8b 05 be 35 eb 01 a8
[ 79.279121] RIP: __handle_mm_fault+0x256/0x910 RSP: ffffb30dc6dafdd0
[ 79.279965] CR2: ffffc467661dc038
[ 79.280403] ---[ end trace 7bd128a831f77757 ]---
[ 79.298303] general protection fault: 0000 [#2] SMP
[ 79.298997] Modules linked in:
[ 79.299402] CPU: 4 PID: 5593 Comm: hald-addon-stor Tainted: G D 4.11.0-rc1-yh-00100-g00db9e3-dirty #175
[ 79.300794] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
[ 79.301707] task: ffff8b43794c0000 task.stack: ffffb30dc6dac000
[ 79.302502] RIP: 0010:__wake_up_common+0x4a/0x90
[ 79.303133] RSP: 0000:ffff8b427ba03de0 EFLAGS: 00010006
[ 79.303807] RAX: ffffb30dc6263da0 RBX: 00000000765622af RCX: 0000000000000000
[ 79.304769] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffb30dc6263da0
[ 79.305730] RBP: ffff8b427ba03e18 R08: 0000000000000000 R09: 0000000000000001
[ 79.306691] R10: 0000000000000000 R11: 000000000e2e7ae4 R12: ffffffffafe71d08
[ 79.307642] R13: 58e0432d872b20f9 R14: 0000000000000000 R15: 0000000000000001
[ 79.308571] FS: 00007f29c1ce4740(0000) GS:ffff8b427ba00000(0000) knlGS:0000000000000000
[ 79.309653] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 79.310434] CR2: ffffc467661dc038 CR3: 000000033a512000 CR4: 00000000000006e0
[ 79.311398] Call Trace:
[ 79.311724] <IRQ>
[ 79.311998] __wake_up+0x39/0x50
[ 79.312458] wake_up_klogd_work_func+0x52/0x60
[ 79.313119] irq_work_run_list+0x43/0x70
[ 79.313634] ? tick_sched_handle.isra.16+0x50/0x50
[ 79.314289] irq_work_tick+0x40/0x50
[ 79.314754] update_process_times+0x42/0x60
[ 79.315332] tick_sched_handle.isra.16+0x41/0x50
[ 79.315933] tick_sched_timer+0x3d/0x70
[ 79.316472] __hrtimer_run_queues+0x264/0x440
[ 79.317046] hrtimer_interrupt+0xb5/0x1c0
[ 79.317601] local_apic_timer_interrupt+0x4d/0x60
[ 79.318213] smp_apic_timer_interrupt+0x38/0x50
[ 79.318803] apic_timer_interrupt+0x95/0xa0
[ 79.319386] RIP: 0010:_raw_spin_unlock_irq+0x2e/0x30
[ 79.320038] RSP: 0000:ffffb30dc6dafe98 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10
[ 79.321051] RAX: 0000000000000004 RBX: ffff8b437a150a80 RCX: 0000000000000000
[ 79.322015] RDX: ffffffffae101c6a RSI: ffffffffaf2a78bc RDI: ffffffffae0c1ced
[ 79.322967] RBP: ffffb30dc6dafe98 R08: 0000000000000001 R09: 0000000000000001
[ 79.323931] R10: 0000000000000000 R11: 00000000000015d9 R12: ffff8b43794c0000
[ 79.324882] R13: 0000000000000009 R14: 0000000000007000 R15: 0000000000000046
[ 79.325835] </IRQ>
[ 79.326122] ? acct_collect+0x16a/0x1c0
[ 79.326653] ? _raw_spin_unlock_irq+0x2c/0x30
[ 79.327222] ? trace_hardirqs_on+0xd/0x10
[ 79.327780] acct_collect+0x16a/0x1c0
[ 79.328268] do_exit+0x207/0xb60
[ 79.328726] rewind_stack_do_exit+0x17/0x20
[ 79.329272] RIP: 0033:0x7f29c0ed53e8
[ 79.329774] RSP: 002b:00007ffe63a0e080 EFLAGS: 00010246
[ 79.330487] RAX: 0000000000000000 RBX: 00000000000007c7 RCX: 00007f29c0ed53e8
[ 79.331413] RDX: 00000000000007c7 RSI: 0000000000000002 RDI: 000000000060d0e0
[ 79.332361] RBP: 0000000000000002 R08: 00007f29c1457de0 R09: 0000000000000000
[ 79.333314] R10: 0000000000000001 R11: 0000000000000246 R12: 000000000060ac20
[ 79.334319] R13: 000000000060d0e0 R14: 000000000060ac28 R15: 00007f29c1457de0
[ 79.335272] Code: 10 89 55 cc 48 8b 57 48 4c 89 45 d0 48 8b 0a 49 39 d4 48 8d 42 e8 4c 8d 69 e8 74 3a 8b 18 48 8b 4d d0 44 89 f2 44 89 fe 48 89 c7 <ff> 50 10 85 c0 74 0b 83 e3 01 74 06 83 6d cc 01 74 19 49 8b 45
[ 79.337814] RIP: __wake_up_common+0x4a/0x90 RSP: ffff8b427ba03de0
[ 79.338630] ---[ end trace 7bd128a831f77758 ]---
[ 79.355927] Kernel panic - not syncing: Fatal exception in interrupt
[ 79.356995] Kernel Offset: 0x2d000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 79.374339] ---[ end Kernel panic - not syncing: Fatal exception in interrupt

2017-03-07 15:19:36

by Ingo Molnar

Subject: Re: [tip:x86/asm] x86/asm: Optimize clear_page()


* Yinghai Lu <[email protected]> wrote:

> On Wed, Mar 1, 2017 at 1:47 AM, tip-bot for Borislav Petkov
> <[email protected]> wrote:
> > Commit-ID: 49ca7bb328c630dd43be626534b49e19513296fd
> > Gitweb: http://git.kernel.org/tip/49ca7bb328c630dd43be626534b49e19513296fd
> > Author: Borislav Petkov <[email protected]>
> > AuthorDate: Thu, 9 Feb 2017 01:34:49 +0100
> > Committer: Ingo Molnar <[email protected]>
> > CommitDate: Wed, 1 Mar 2017 10:18:32 +0100
> >
> > x86/asm: Optimize clear_page()
> >
> > Currently, we CALL clear_page() which then JMPs to the proper function
> > chosen by the alternatives.
> >
> > What we should do instead is CALL the proper function directly. (This
> > was something Ingo suggested a while ago). So let's do that.
>
> looks like this one broke the kexec.
> after revert it back, kexec work again.

Ok, this should be fixed in the new version I just pushed out:

f25d38475519 x86/asm: Optimize clear_page()

Please let me know if it doesn't.

Thanks,

Ingo

2017-03-07 20:08:58

by Yinghai Lu

Subject: Re: [tip:x86/asm] x86/asm: Optimize clear_page()

On Mon, Mar 6, 2017 at 11:30 PM, Ingo Molnar <[email protected]> wrote:
>
> * Yinghai Lu <[email protected]> wrote:
>
>> On Wed, Mar 1, 2017 at 1:47 AM, tip-bot for Borislav Petkov
>> <[email protected]> wrote:
>> > Commit-ID: 49ca7bb328c630dd43be626534b49e19513296fd
>> > Gitweb: http://git.kernel.org/tip/49ca7bb328c630dd43be626534b49e19513296fd
>> > Author: Borislav Petkov <[email protected]>
>> > AuthorDate: Thu, 9 Feb 2017 01:34:49 +0100
>> > Committer: Ingo Molnar <[email protected]>
>> > CommitDate: Wed, 1 Mar 2017 10:18:32 +0100
>> >
>> > x86/asm: Optimize clear_page()
>> >
>> > Currently, we CALL clear_page() which then JMPs to the proper function
>> > chosen by the alternatives.
>> >
>> > What we should do instead is CALL the proper function directly. (This
>> > was something Ingo suggested a while ago). So let's do that.
>>
>> looks like this one broke the kexec.
>> after revert it back, kexec work again.
>
> Ok, this should be fixed in the new version I just pushed out:
>
> f25d38475519 x86/asm: Optimize clear_page()
>
> Please let me know if it doesn't.

Yes. new commit works with kexec.

Thanks

Yinghai

2017-03-08 11:57:17

by Ingo Molnar

Subject: Re: [tip:x86/asm] x86/asm: Optimize clear_page()


* Yinghai Lu <[email protected]> wrote:

> On Mon, Mar 6, 2017 at 11:30 PM, Ingo Molnar <[email protected]> wrote:
> >
> > * Yinghai Lu <[email protected]> wrote:
> >
> >> On Wed, Mar 1, 2017 at 1:47 AM, tip-bot for Borislav Petkov
> >> <[email protected]> wrote:
> >> > Commit-ID: 49ca7bb328c630dd43be626534b49e19513296fd
> >> > Gitweb: http://git.kernel.org/tip/49ca7bb328c630dd43be626534b49e19513296fd
> >> > Author: Borislav Petkov <[email protected]>
> >> > AuthorDate: Thu, 9 Feb 2017 01:34:49 +0100
> >> > Committer: Ingo Molnar <[email protected]>
> >> > CommitDate: Wed, 1 Mar 2017 10:18:32 +0100
> >> >
> >> > x86/asm: Optimize clear_page()
> >> >
> >> > Currently, we CALL clear_page() which then JMPs to the proper function
> >> > chosen by the alternatives.
> >> >
> >> > What we should do instead is CALL the proper function directly. (This
> >> > was something Ingo suggested a while ago). So let's do that.
> >>
> >> looks like this one broke the kexec.
> >> after revert it back, kexec work again.
> >
> > Ok, this should be fixed in the new version I just pushed out:
> >
> > f25d38475519 x86/asm: Optimize clear_page()
> >
> > Please let me know if it doesn't.
>
> Yes. new commit works with kexec.
>
> Thanks

Thanks for testing!

Ingo