Date: Fri, 6 Mar 2009 20:42:20 -0500
From: Mathieu Desnoyers
To: Masami Hiramatsu
Cc: Ingo Molnar, Andrew Morton, Nick Piggin, Steven Rostedt, Andi Kleen,
	Linux Kernel Mailing List, Thomas Gleixner, Peter Zijlstra,
	Frederic Weisbecker, Linus Torvalds, Arjan van de Ven, Rusty Russell,
	"H. Peter Anvin"
Subject: Re: [PATCH -tip 5/4] Expands irq-off region in text_poke()
Message-ID: <20090307014219.GA27154@Krystal>
References: <49B1428A.9050500@redhat.com> <49B14352.2040705@redhat.com>
	<20090306181356.GD14236@Krystal> <49B16C69.6060203@redhat.com>
	<20090306190828.GA28582@elte.hu> <20090306191517.GG14236@Krystal>
	<20090306192246.GF28582@elte.hu> <49B186A9.3030506@redhat.com>
	<20090306210116.GB20603@Krystal> <49B19C5B.30805@redhat.com>
In-Reply-To: <49B19C5B.30805@redhat.com>

* Masami Hiramatsu (mhiramat@redhat.com) wrote:
>
>
> Mathieu Desnoyers wrote:
> > * Masami Hiramatsu (mhiramat@redhat.com) wrote:
> >> Ingo Molnar wrote:
> >>> * Mathieu Desnoyers wrote:
> >>>
> >>>> * Ingo Molnar (mingo@elte.hu) wrote:
> >>>>> * Masami Hiramatsu wrote:
> >>>>>
> >>>>>> @@ -523,14 +526,17 @@ void *__kprobes text_poke(void *addr, co
> >>>>>>  		pages[1] = virt_to_page(addr + PAGE_SIZE);
> >>>>>>  	}
> >>>>>>  	BUG_ON(!pages[0]);
> >>>>>> -	if (!pages[1])
> >>>>>> -		nr_pages = 1;
> >>>>>> -	vaddr = vmap(pages, nr_pages, VM_MAP, PAGE_KERNEL);
> >>>>>> -	BUG_ON(!vaddr);
> >>>>>> -	local_irq_disable();
> >>>>>> +	local_irq_save(flags);
> >>>>>> +	set_fixmap(FIX_TEXT_POKE0, page_to_phys(pages[0]));
> >>>>>> +	if (pages[1])
> >>>>>> +		set_fixmap(FIX_TEXT_POKE1, page_to_phys(pages[1]));
> >>>>>> +	vaddr = (char *)fix_to_virt(FIX_TEXT_POKE0);
> >>>>>>  	memcpy(&vaddr[(unsigned long)addr & ~PAGE_MASK], opcode, len);
> >>>>>> -	local_irq_enable();
> >>>>>> -	vunmap(vaddr);
> >>>>>> +	clear_fixmap(FIX_TEXT_POKE0);
> >>>>>> +	if (pages[1])
> >>>>>> +		clear_fixmap(FIX_TEXT_POKE1);
> >>>>>> +	local_flush_tlb();
> >>>>>> +	local_irq_restore(flags);
> >>>>>>  	sync_core();
> >>>>> I'm not sure at all about this widening of the irq-atomic
> >>>>> section and the idea of allowing non-locked access on single-CPU
> >>>>> situations - we don't really want to micro-optimize any of this
> >>>>> on such a level, holding the text lock is a robust rule all code
> >>>>> should be listening to. (Creating locking asymmetry always
> >>>>> inserts a certain amount of fragility - adding to an already
> >>>>> fragile concept here.)
> >>>>>
> >>>>> And note that there's no reason why text_poke could not be used
> >>>>> in stop_machine_run() - the stop_machine_run() handler must not
> >>>>> take the text_lock of course - but outside code calling
> >>>>> stop_machine_run() can do it and can hence serialize properly.
> >>>>>
> >>>>> Note that even if we did this then your v2 patch is not fully
> >>>>> correct: you need to move the sync_core() at the end of the
> >>>>> sequence inside the critical section too. (right now this is
> >>>>> mostly harmless because the INVLPG inside the clear_fixmap()
> >>>>> happens to be serializing so it has an implicit sync_core()
> >>>>> property - but nevertheless we better do this straight away to
> >>>>> not cause problems later down the line.)
> >>>>>
> >>>>> 	Ingo
> >>>> Agreed. The alternatives_smp_lock/alternatives_smp_unlock
> >>>> specific case does not bring us much if it has no perceivable
> >>>> performance impact. It's better to keep a standard interface
> >>>> and clear requirements.
> >>> Note that I don't object to another aspect of this same change:
> >>> the fact that it makes the whole sequence more atomic and more
> >>> defensive [which is never bad for fragile interfaces].
> >>>
> >>> I only got worried about the "let's use this without the text
> >>> lock" ideas.
> >>>
> >>> So if Masami-san sends a delta patch with a different changelog
> >>> and with the sync_core() bit moved inside the critical section,
> >>> I'll apply that too.
> >> OK, here is the delta patch.
> >>
> >> Expand irq-atomic region to cover fixmap-using code and sync_core.
> >>
> >> Signed-off-by: Masami Hiramatsu
> >> Cc: Mathieu Desnoyers
> >> Cc: Ingo Molnar
> >> ---
> >>  arch/x86/kernel/alternative.c |    4 ++--
> >>  1 file changed, 2 insertions(+), 2 deletions(-)
> >>
> >> Index: linux-2.6-tip/arch/x86/kernel/alternative.c
> >> ===================================================================
> >> --- linux-2.6-tip.orig/arch/x86/kernel/alternative.c
> >> +++ linux-2.6-tip/arch/x86/kernel/alternative.c
> >> @@ -526,13 +526,12 @@ void *__kprobes text_poke(void *addr, co
> >>  		pages[1] = virt_to_page(addr + PAGE_SIZE);
> >>  	}
> >>  	BUG_ON(!pages[0]);
> >> +	local_irq_save(flags);
> >>  	set_fixmap(FIX_TEXT_POKE0, page_to_phys(pages[0]));
> >>  	if (pages[1])
> >>  		set_fixmap(FIX_TEXT_POKE1, page_to_phys(pages[1]));
> >>  	vaddr = (char *)fix_to_virt(FIX_TEXT_POKE0);
> >> -	local_irq_save(flags);
> >>  	memcpy(&vaddr[(unsigned long)addr & ~PAGE_MASK], opcode, len);
> >> -	local_irq_restore(flags);
> >>  	clear_fixmap(FIX_TEXT_POKE0);
> >>  	if (pages[1])
> >>  		clear_fixmap(FIX_TEXT_POKE1);
> >> @@ -540,6 +539,7 @@ void *__kprobes text_poke(void *addr, co
> >>  	sync_core();
> >>  	/* Could also do a CLFLUSH here to speed up CPU recovery; but
> >>  	   that causes hangs on some VIA CPUs. */
> >> +	local_irq_restore(flags);
> >>  	for (i = 0; i < len; i++)
> >>  		BUG_ON(((char *)addr)[i] != ((char *)opcode)[i]);
> >
> > I think irq off should cover the BUG_ON too. This safety check assumes
> > we are the only ones modifying "addr".
>
> I think others don't change it without holding text_mutex, do they?
>

They shouldn't, but given that we decided to grow the irq-off region to
contain all the code that needs to be executed atomically, it should
also contain the BUG_ON, because the check is expected to be as atomic
as the rest of the code.

Mathieu

> Thank you,
>
> --
> Masami Hiramatsu
>
> Software Engineer
> Hitachi Computer Products (America) Inc.
> Software Solutions Division
>
> e-mail: mhiramat@redhat.com
>

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
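
P.S. To make the ordering I am asking for concrete, here is a rough
userspace model of it. This is only a sketch, not kernel code and not
the patch itself: model_irq_save(), model_irq_restore(),
model_text_poke() and the irqs_off flag are made-up stand-ins for
local_irq_save(), local_irq_restore(), text_poke() and the CPU
interrupt state, a plain memcpy() stands in for the write through the
fixmap, and assert() stands in for BUG_ON(). The fixmap setup/teardown,
the TLB flush and sync_core() are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the local "interrupts disabled" state. */
static bool irqs_off;

/* Stand-ins for local_irq_save()/local_irq_restore(); assumptions only. */
static void model_irq_save(void)    { irqs_off = true; }
static void model_irq_restore(void) { irqs_off = false; }

/*
 * Model of the proposed ordering: the write and the verification loop
 * both sit between the save and the restore.
 * Returns true if the verification ran while "interrupts" were off.
 */
static bool model_text_poke(unsigned char *addr,
			    const unsigned char *opcode, size_t len)
{
	bool checked_with_irqs_off;
	size_t i;

	model_irq_save();
	memcpy(addr, opcode, len);		/* stands in for the fixmap write */
	for (i = 0; i < len; i++)
		assert(addr[i] == opcode[i]);	/* stands in for BUG_ON() */
	checked_with_irqs_off = irqs_off;	/* record state at check time */
	model_irq_restore();

	return checked_with_irqs_off;
}
```

Calling model_text_poke() on a small buffer returns true, because the
comparison loop runs before the modelled restore. Moving
model_irq_restore() above the loop would make it return false, which is
the window I would like the real function to avoid.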