Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1756127AbZCFV5n (ORCPT ); Fri, 6 Mar 2009 16:57:43 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org
        id S1753502AbZCFV5d (ORCPT ); Fri, 6 Mar 2009 16:57:33 -0500
Received: from mx2.redhat.com ([66.187.237.31]:48783 "EHLO mx2.redhat.com"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1751064AbZCFV5d (ORCPT ); Fri, 6 Mar 2009 16:57:33 -0500
Message-ID: <49B19C5B.30805@redhat.com>
Date: Fri, 06 Mar 2009 16:57:47 -0500
From: Masami Hiramatsu
User-Agent: Thunderbird 2.0.0.19 (X11/20090105)
MIME-Version: 1.0
To: Mathieu Desnoyers
CC: Ingo Molnar, Andrew Morton, Nick Piggin, Steven Rostedt, Andi Kleen,
        Linux Kernel Mailing List, Thomas Gleixner, Peter Zijlstra,
        Frederic Weisbecker, Linus Torvalds, Arjan van de Ven,
        Rusty Russell, "H. Peter Anvin"
Subject: Re: [PATCH -tip 5/4] Expands irq-off region in text_poke()
References: <49B1428A.9050500@redhat.com> <49B14352.2040705@redhat.com>
        <20090306181356.GD14236@Krystal> <49B16C69.6060203@redhat.com>
        <20090306190828.GA28582@elte.hu> <20090306191517.GG14236@Krystal>
        <20090306192246.GF28582@elte.hu> <49B186A9.3030506@redhat.com>
        <20090306210116.GB20603@Krystal>
In-Reply-To: <20090306210116.GB20603@Krystal>
X-Enigmail-Version: 0.95.7
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 5073
Lines: 128

Mathieu Desnoyers wrote:
> * Masami Hiramatsu (mhiramat@redhat.com) wrote:
>> Ingo Molnar wrote:
>>> * Mathieu Desnoyers wrote:
>>>
>>>> * Ingo Molnar (mingo@elte.hu) wrote:
>>>>> * Masami Hiramatsu wrote:
>>>>>
>>>>>> @@ -523,14 +526,17 @@ void *__kprobes text_poke(void *addr, co
>>>>>>                  pages[1] = virt_to_page(addr + PAGE_SIZE);
>>>>>>          }
>>>>>>          BUG_ON(!pages[0]);
>>>>>> -        if (!pages[1])
>>>>>> -                nr_pages = 1;
>>>>>> -        vaddr = vmap(pages, nr_pages, VM_MAP, PAGE_KERNEL);
>>>>>> -        BUG_ON(!vaddr);
>>>>>> -        local_irq_disable();
>>>>>> +        local_irq_save(flags);
>>>>>> +        set_fixmap(FIX_TEXT_POKE0, page_to_phys(pages[0]));
>>>>>> +        if (pages[1])
>>>>>> +                set_fixmap(FIX_TEXT_POKE1, page_to_phys(pages[1]));
>>>>>> +        vaddr = (char *)fix_to_virt(FIX_TEXT_POKE0);
>>>>>>          memcpy(&vaddr[(unsigned long)addr & ~PAGE_MASK], opcode, len);
>>>>>> -        local_irq_enable();
>>>>>> -        vunmap(vaddr);
>>>>>> +        clear_fixmap(FIX_TEXT_POKE0);
>>>>>> +        if (pages[1])
>>>>>> +                clear_fixmap(FIX_TEXT_POKE1);
>>>>>> +        local_flush_tlb();
>>>>>> +        local_irq_restore(flags);
>>>>>>          sync_core();
>>>>> I'm not sure at all about this widening of the irq-atomic
>>>>> section and the idea of allowing non-locked access on single-CPU
>>>>> situations - we dont really want to micro-optimize any of this
>>>>> on such a level, holding the text lock is a robust rule all code
>>>>> should be listening to. (Creating locking assymetry always
>>>>> inserts a certain amount of fragility - adding to an already
>>>>> fragile concept here.)
>>>>>
>>>>> And note that there's no reason why text_poke could not be used
>>>>> in stop_machine_run() - the stop_machine_run() handler must not
>>>>> take the text_lock of course - but outside code calling
>>>>> stop_machine_run() can do it and can hence serialize properly.
>>>>>
>>>>> Note that even if we did this then your v2 patch is not fully
>>>>> correct: you need to move the sync_core() at the end of the
>>>>> sequence inside the critical section too. (right now this is
>>>>> mostly harmless because the INVLPG inside the clear_fixmap()
>>>>> happens to be serializing so it has an implicit sync_core()
>>>>> property - but nevertheless we better do this straight away to
>>>>> not cause problems later down the line.)
>>>>>
>>>>> Ingo
>>>> Agreed. The alternatives_smp_lock/alternatives_smp_unlock
>>>> specific case does not bring us much if it has no perceivable
>>>> performance impact. It's better to keep a standard interface
>>>> and clear requirements.
>>> Note that i dont object to another aspect of this same change:
>>> the fact that it makes the whole sequence more atomic and more
>>> defensive [which is never bad of fragile interfaces].
>>>
>>> I only got worried about the "lets use this without the text
>>> lock" ideas.
>>>
>>> So if Masami-san sends a delta patch with a different changelog
>>> and with the sync_core() bit moved inside the critical section,
>>> i'll apply that too.
>> OK, here is the delta patch.
>>
>> Expand irq-atomic region to cover fixmap using code and sync_core.
>>
>> Signed-off-by: Masami Hiramatsu
>> Cc: Mathieu Desnoyers
>> Cc: Ingo Molnar
>> ---
>>  arch/x86/kernel/alternative.c |    4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> Index: linux-2.6-tip/arch/x86/kernel/alternative.c
>> ===================================================================
>> --- linux-2.6-tip.orig/arch/x86/kernel/alternative.c
>> +++ linux-2.6-tip/arch/x86/kernel/alternative.c
>> @@ -526,13 +526,12 @@ void *__kprobes text_poke(void *addr, co
>>                  pages[1] = virt_to_page(addr + PAGE_SIZE);
>>          }
>>          BUG_ON(!pages[0]);
>> +        local_irq_save(flags);
>>          set_fixmap(FIX_TEXT_POKE0, page_to_phys(pages[0]));
>>          if (pages[1])
>>                  set_fixmap(FIX_TEXT_POKE1, page_to_phys(pages[1]));
>>          vaddr = (char *)fix_to_virt(FIX_TEXT_POKE0);
>> -        local_irq_save(flags);
>>          memcpy(&vaddr[(unsigned long)addr & ~PAGE_MASK], opcode, len);
>> -        local_irq_restore(flags);
>>          clear_fixmap(FIX_TEXT_POKE0);
>>          if (pages[1])
>>                  clear_fixmap(FIX_TEXT_POKE1);
>> @@ -540,6 +539,7 @@ void *__kprobes text_poke(void *addr, co
>>          sync_core();
>>          /* Could also do a CLFLUSH here to speed up CPU recovery; but
>>             that causes hangs on some VIA CPUs. */
>> +        local_irq_restore(flags);
>>          for (i = 0; i < len; i++)
>>                  BUG_ON(((char *)addr)[i] != ((char *)opcode)[i]);
>
> I think irq off should cover the BUG_ON too. This safety check assumes
> we are the only ones modifying "addr".

I think nobody else modifies that text without holding text_mutex, do they?
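To make the ordering under discussion easier to follow, here is a rough
userspace model of it. This is only a sketch and not kernel code: every
model_* name is a hypothetical stand-in invented for this illustration,
the "interrupt flag" is a plain boolean, and the fixmap alias is modelled
by writing straight into the buffer. It shows the delta patch's order
(save before the mapping is set up, restore after the serializing step),
and for illustration it also keeps the read-back check inside the region,
as suggested in the review.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-ins only; none of these are kernel APIs. */
static bool model_irqs_enabled = true;  /* models the CPU interrupt flag    */
static int model_unprotected_steps;     /* steps that ran with "irqs" on    */

/* Like local_irq_save(): remember the previous state, then disable. */
static bool model_irq_save(void)
{
    bool flags = model_irqs_enabled;

    model_irqs_enabled = false;
    return flags;
}

/* Like local_irq_restore(): put back whatever state the caller had. */
static void model_irq_restore(bool flags)
{
    model_irqs_enabled = flags;
}

/* Each step that should sit inside the irq-off region reports here. */
static void model_step(void)
{
    if (model_irqs_enabled)
        model_unprotected_steps++;
}

/*
 * Model of the patched sequence. Returns 0 if the read-back matches the
 * opcode bytes, -1 otherwise (the kernel code uses BUG_ON() instead).
 */
int model_text_poke(unsigned char *text, size_t off,
                    const unsigned char *opcode, size_t len)
{
    bool flags = model_irq_save();
    int ret = 0;

    model_step();                     /* stands in for set_fixmap()         */
    memcpy(text + off, opcode, len);  /* write through the "alias"          */
    model_step();                     /* stands in for clear_fixmap() and
                                         local_flush_tlb()                  */
    model_step();                     /* stands in for sync_core()          */

    model_step();                     /* read-back check, kept inside the
                                         region in this model               */
    if (memcmp(text + off, opcode, len) != 0)
        ret = -1;

    model_irq_restore(flags);
    return ret;
}
```

In this model a caller that already runs with the flag cleared gets it
back cleared afterwards, which is the reason for using the save/restore
pair rather than an unconditional disable/enable. The model has no
concurrency, so the read-back can never fail here; in the real function
it is text_mutex, taken by the callers, that keeps other writers away.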
Thank you,

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America) Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com