Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1757569AbcLBSL5 (ORCPT ); Fri, 2 Dec 2016 13:11:57 -0500 Received: from p4FC0A62B.dip0.t-ipconnect.de ([79.192.166.43]:33458 "EHLO pd.tnic" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1751561AbcLBSLm (ORCPT ); Fri, 2 Dec 2016 13:11:42 -0500 X-Greylist: delayed 452 seconds by postgrey-1.27 at vger.kernel.org; Fri, 02 Dec 2016 13:11:42 EST Date: Fri, 2 Dec 2016 19:03:43 +0100 From: Borislav Petkov To: Andy Lutomirski Cc: Linus Torvalds , Peter Anvin , the arch/x86 maintainers , One Thousand Gnomes , Borislav Petkov , "linux-kernel@vger.kernel.org" , Brian Gerst , Matthew Whitehead , Henrique de Moraes Holschuh , Peter Zijlstra , Andrew Cooper Subject: Re: [PATCH v2 5/6] x86/xen: Add a Xen-specific sync_core() implementation Message-ID: <20161202180343.gehqor7lgtmzwqq3@pd.tnic> References: <0a21157c2233ba7d0781bbf07866b3f2d7e7c25d.1480638597.git.luto@kernel.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: User-Agent: NeoMutt/20161014 (1.7.1) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2666 Lines: 86 On Fri, Dec 02, 2016 at 09:38:38AM -0800, Andy Lutomirski wrote: > apply_alternatives, unfortunately. It's performance-critical because > it's intensely stupid and does sync_core() for every single patch. > Fixing that would be nice, too. So I did experiment at the time to batch that sync_core() and interrupts disabling in text_poke_early() but that amounted to almost nothing. That's why I didn't bother chasing this further. I can try to dig out that old thread... /me goes and searches... ah, here's something: SNB: * before: [ 0.011707] SMP alternatives: Starting apply_alternatives... [ 0.011784] SMP alternatives: done. [ 0.011806] Freeing SMP alternatives: 20k freed * after: [ 0.011699] SMP alternatives: Starting apply_alternatives... [ 0.011721] SMP alternatives: done. [ 0.011743] Freeing SMP alternatives: 20k freed -> 63 microseconds speedup * kvm guest: * before: [ 0.017005] SMP alternatives: Starting apply_alternatives... [ 0.019024] SMP alternatives: done. [ 0.020119] Freeing SMP alternatives: 20k freed * after: [ 0.015008] SMP alternatives: Starting apply_alternatives... [ 0.016029] SMP alternatives: done. [ 0.017118] Freeing SMP alternatives: 20k freed -> ~3 milliseconds speedup --- I tried something simple like this: --- diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c index 1850592f4700..f4d5689ea503 100644 --- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -253,10 +253,13 @@ void __init_or_module apply_alternatives(struct alt_instr *start, struct alt_instr *end) { struct alt_instr *a; + unsigned long flags; u8 *instr, *replacement; u8 insnbuf[MAX_PATCH_LEN]; DPRINTK("%s: alt table %p -> %p\n", __func__, start, end); + + local_irq_save(flags); /* * The scan order should be from start to end. A later scanned * alternative code can overwrite a previous scanned alternative code. @@ -284,8 +287,10 @@ void __init_or_module apply_alternatives(struct alt_instr *start, add_nops(insnbuf + a->replacementlen, a->instrlen - a->replacementlen); - text_poke_early(instr, insnbuf, a->instrlen); + memcpy(instr, insnbuf, a->instrlen); } + sync_core(); + local_irq_restore(flags); } #ifdef CONFIG_SMP --- I could try and redo the measurements again but I doubt it'll be any different. Unless I haven't made a mistake somewhere... -- Regards/Gruss, Boris. Good mailing practices for 400: avoid top-posting and trim the reply.