Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756242AbaDVNmJ (ORCPT ); Tue, 22 Apr 2014 09:42:09 -0400 Received: from cam-admin0.cambridge.arm.com ([217.140.96.50]:46457 "EHLO cam-admin0.cambridge.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752807AbaDVNmD (ORCPT ); Tue, 22 Apr 2014 09:42:03 -0400 Date: Tue, 22 Apr 2014 14:42:00 +0100 From: Mark Rutland To: Jean Pihet Cc: "linux-kernel@vger.kernel.org" , "linaro-kernel@lists.linaro.org" , "linux-arm-kernel@lists.infradead.org" , Arnaldo , Ingo Molnar , Jiri Olsa , Steve Capper , "patches@linaro.org" , Corey Ashford , Frederic Weisbecker , Namhyung Kim , Paul Mackerras , Peter Zijlstra , David Ahern Subject: Re: [PATCH 1/3] perf tests: Introduce perf_regs_load function on ARM64 Message-ID: <20140422134159.GC9625@e106331-lin.cambridge.arm.com> References: <1395222155-22205-1-git-send-email-jean.pihet@linaro.org> <1395222155-22205-2-git-send-email-jean.pihet@linaro.org> <20140321151126.GR23372@e106331-lin.cambridge.arm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Thread-Topic: [PATCH 1/3] perf tests: Introduce perf_regs_load function on ARM64 Accept-Language: en-GB, en-US Content-Language: en-US acceptlanguage: en-GB, en-US User-Agent: Mutt/1.5.20 (2009-06-14) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Jean, Apologies for the delay on this. On Tue, Mar 25, 2014 at 03:23:26PM +0000, Jean Pihet wrote: > Hi Mark, > > On 21 March 2014 16:11, Mark Rutland wrote: > > Hi Jean, > > > > On Wed, Mar 19, 2014 at 09:42:33AM +0000, Jean Pihet wrote: > >> Introducing perf_regs_load function, which is going > >> to be used for dwarf unwind test in following patches. > >> > >> It takes single argument as a pointer to the regs dump > >> buffer and populates it with current registers values, as > >> expected by the perf built-in unwinding test. > >> > >> Signed-off-by: Jean Pihet > >> Cc: Steve Capper > >> Cc: Corey Ashford > >> Cc: Frederic Weisbecker > >> Cc: Ingo Molnar > >> Cc: Namhyung Kim > >> Cc: Paul Mackerras > >> Cc: Peter Zijlstra > >> Cc: Arnaldo Carvalho de Melo > >> Cc: David Ahern > >> Cc: Jiri Olsa > >> --- > >> tools/perf/arch/arm64/Makefile | 1 + > >> tools/perf/arch/arm64/include/perf_regs.h | 2 ++ > >> tools/perf/arch/arm64/tests/regs_load.S | 39 +++++++++++++++++++++++++++++++ > >> 3 files changed, 42 insertions(+) > >> create mode 100644 tools/perf/arch/arm64/tests/regs_load.S > >> > >> diff --git a/tools/perf/arch/arm64/Makefile b/tools/perf/arch/arm64/Makefile > >> index 67e9b3d..9b8f87e 100644 > >> --- a/tools/perf/arch/arm64/Makefile > >> +++ b/tools/perf/arch/arm64/Makefile > >> @@ -4,4 +4,5 @@ LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/dwarf-regs.o > >> endif > >> ifndef NO_LIBUNWIND > >> LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/unwind-libunwind.o > >> +LIB_OBJS += $(OUTPUT)arch/$(ARCH)/tests/regs_load.o > >> endif > >> diff --git a/tools/perf/arch/arm64/include/perf_regs.h b/tools/perf/arch/arm64/include/perf_regs.h > >> index 2359546..1e052f1 100644 > >> --- a/tools/perf/arch/arm64/include/perf_regs.h > >> +++ b/tools/perf/arch/arm64/include/perf_regs.h > >> @@ -9,6 +9,8 @@ > >> #define PERF_REG_IP PERF_REG_ARM64_PC > >> #define PERF_REG_SP PERF_REG_ARM64_SP > >> > >> +void perf_regs_load(u64 *regs); > >> + > >> static inline const char *perf_reg_name(int id) > >> { > >> switch (id) { > >> diff --git a/tools/perf/arch/arm64/tests/regs_load.S b/tools/perf/arch/arm64/tests/regs_load.S > >> new file mode 100644 > >> index 0000000..92ab968 > >> --- /dev/null > >> +++ b/tools/perf/arch/arm64/tests/regs_load.S > >> @@ -0,0 +1,39 @@ > >> +#include > >> + > >> +/* > >> + * Implementation of void perf_regs_load(u64 *regs); > >> + * > >> + * This functions fills in the 'regs' buffer from the actual registers values, > >> + * in the way the perf built-in unwinding test expects them: > >> + * - the PC at the time at the call to this function. Since this function > >> + * is called using a bl instruction, the PC value is taken from LR, > > > > Is it guaranteed that this function is always invoked with a branch with > > link instruction, or is that just what current compiler versions are > > doing? I couldn't see where we would get that guarantee from. > The current compiler implements the call as a bl instruction. While I don't think we can rely on the compiler using a bl to call the function it shouldn't matter here if we only care about the LR value being an address within the caller, as it doesn't look amenable to tail call optimization. > > If it is called with a branch with link, then the LR value will be the > > PC at call time + 4, rather than just the exact PC at call time. If not > > then we don't have a guaranteed relationship between the PC at call time > > and the current LR value. > > > > If the only place that perf_regs_load is used is a single test which > > doesn't care about the precise PC at the time of the call, then it's > > probably OK to use the LR value, but we should be careful to document > > what the faked-up PC actually is and how we expect it to be used. > The code is only used by an unwinding test. The unwinding code > resolves the function name from an address range found in the dwarf > information so in principle it is ok to use the PC/LR at the time of > the call to a function. > > Is the comment above OK or do you want an update of the code as well? If we just need an (arbitrary) address within the caller, a comment update should be fine. > >> + * - the current SP (not touched by this function), > >> + * - the current value of LR is merely retrieved and stored because the > >> + * value before the call to this function is unknown at this time; it will > >> + * be unwound from the dwarf information in unwind__get_entries. > >> + */ > >> + > >> +.text > >> +.type perf_regs_load,%function > >> +ENTRY(perf_regs_load) > >> + stp x0, x1, [x0], #16 // store x0..x29 > >> + stp x2, x3, [x0], #16 > >> + stp x4, x5, [x0], #16 > >> + stp x6, x7, [x0], #16 > >> + stp x8, x9, [x0], #16 > >> + stp x10, x11, [x0], #16 > >> + stp x12, x13, [x0], #16 > >> + stp x14, x15, [x0], #16 > >> + stp x16, x17, [x0], #16 > >> + stp x18, x19, [x0], #16 > >> + stp x20, x21, [x0], #16 > >> + stp x22, x23, [x0], #16 > >> + stp x24, x25, [x0], #16 > >> + stp x26, x27, [x0], #16 > >> + stp x28, x29, [x0], #16 > >> + mov x1, sp > >> + stp x30, x1, [x0], #16 // store lr and sp > >> + str x30, [x0] // store pc as lr in order to skip the call > >> + // to this function > > > > It might be better to word this a "store the lr in place of the pc". To > > me at least the current wording implies the opposite of what the code > > seems to be doing. > Ok the last comment can be updated. Ok, cheers. With those changes I think this looks fine. Thanks, Mark. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/