2022-05-05 18:19:00

by Ali Saidi

[permalink] [raw]
Subject: [PATCH v8 2/5] perf tools: sync addition of PERF_MEM_SNOOPX_PEER

Add a flag to the perf mem data struct to signal that a request caused a
cache-to-cache transfer of a line from a peer of the requestor and
wasn't sourced from a lower cache level. The line being moved from one
peer cache to another has latency and performance implications. On Arm64
Neoverse systems the data source can indicate a cache-to-cache transfer
but not if the line is dirty or clean, so instead of overloading HITM
define a new flag that indicates this type of transfer.

Signed-off-by: Ali Saidi <[email protected]>
Reviewed-by: Leo Yan <[email protected]>
---
tools/include/uapi/linux/perf_event.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index d37629dbad72..7b88bfd097dc 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -1310,7 +1310,7 @@ union perf_mem_data_src {
#define PERF_MEM_SNOOP_SHIFT 19

#define PERF_MEM_SNOOPX_FWD 0x01 /* forward */
-/* 1 free */
+#define PERF_MEM_SNOOPX_PEER 0x02 /* xfer from peer */
#define PERF_MEM_SNOOPX_SHIFT 38

/* locked instruction */
--
2.32.0



2022-05-10 20:30:51

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH v8 2/5] perf tools: sync addition of PERF_MEM_SNOOPX_PEER

Em Wed, May 04, 2022 at 06:48:47PM +0000, Ali Saidi escreveu:
> Add a flag to the perf mem data struct to signal that a request caused a
> cache-to-cache transfer of a line from a peer of the requestor and
> wasn't sourced from a lower cache level. The line being moved from one
> peer cache to another has latency and performance implications. On Arm64
> Neoverse systems the data source can indicate a cache-to-cache transfer
> but not if the line is dirty or clean, so instead of overloading HITM
> define a new flag that indicates this type of transfer.
>
> Signed-off-by: Ali Saidi <[email protected]>
> Reviewed-by: Leo Yan <[email protected]>

Was this already merged on the ARM kernel tree?

- Arnaldo

> ---
> tools/include/uapi/linux/perf_event.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
> index d37629dbad72..7b88bfd097dc 100644
> --- a/tools/include/uapi/linux/perf_event.h
> +++ b/tools/include/uapi/linux/perf_event.h
> @@ -1310,7 +1310,7 @@ union perf_mem_data_src {
> #define PERF_MEM_SNOOP_SHIFT 19
>
> #define PERF_MEM_SNOOPX_FWD 0x01 /* forward */
> -/* 1 free */
> +#define PERF_MEM_SNOOPX_PEER 0x02 /* xfer from peer */
> #define PERF_MEM_SNOOPX_SHIFT 38
>
> /* locked instruction */
> --
> 2.32.0

--

- Arnaldo

2022-05-11 04:42:24

by Leo Yan

[permalink] [raw]
Subject: Re: [PATCH v8 2/5] perf tools: sync addition of PERF_MEM_SNOOPX_PEER

On Tue, May 10, 2022 at 01:28:38PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Wed, May 04, 2022 at 06:48:47PM +0000, Ali Saidi escreveu:
> > Add a flag to the perf mem data struct to signal that a request caused a
> > cache-to-cache transfer of a line from a peer of the requestor and
> > wasn't sourced from a lower cache level. The line being moved from one
> > peer cache to another has latency and performance implications. On Arm64
> > Neoverse systems the data source can indicate a cache-to-cache transfer
> > but not if the line is dirty or clean, so instead of overloading HITM
> > define a new flag that indicates this type of transfer.
> >
> > Signed-off-by: Ali Saidi <[email protected]>
> > Reviewed-by: Leo Yan <[email protected]>
>
> Was this already merged on the ARM kernel tree?

No, I don't think this patch has been merged on Arm kernel tree. I searched
Arm and Arm64 git repos, none of them has merged this patch.

https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/log/?qt=author&q=Ali+Saidi
http://git.armlinux.org.uk/cgit/linux-arm.git/log/?qt=author&q=Ali+Saidi

P.s. Ali missed to include German's review tag, see:
https://lore.kernel.org/lkml/[email protected]/

Do you want us to resend the patch set for adding tags?

Thanks,
Leo

2022-05-11 09:48:39

by kajoljain

[permalink] [raw]
Subject: Re: [PATCH v8 2/5] perf tools: sync addition of PERF_MEM_SNOOPX_PEER



On 5/5/22 00:18, Ali Saidi wrote:
> Add a flag to the perf mem data struct to signal that a request caused a
> cache-to-cache transfer of a line from a peer of the requestor and
> wasn't sourced from a lower cache level. The line being moved from one
> peer cache to another has latency and performance implications. On Arm64
> Neoverse systems the data source can indicate a cache-to-cache transfer
> but not if the line is dirty or clean, so instead of overloading HITM
> define a new flag that indicates this type of transfer.
>
> Signed-off-by: Ali Saidi <[email protected]>
> Reviewed-by: Leo Yan <[email protected]>
> ---
> tools/include/uapi/linux/perf_event.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
> index d37629dbad72..7b88bfd097dc 100644
> --- a/tools/include/uapi/linux/perf_event.h
> +++ b/tools/include/uapi/linux/perf_event.h
> @@ -1310,7 +1310,7 @@ union perf_mem_data_src {
> #define PERF_MEM_SNOOP_SHIFT 19
>
> #define PERF_MEM_SNOOPX_FWD 0x01 /* forward */
> -/* 1 free */
> +#define PERF_MEM_SNOOPX_PEER 0x02 /* xfer from peer */
> #define PERF_MEM_SNOOPX_SHIFT 38
>
> /* locked instruction */
Patch looks good to me.

Reviewed-By: Kajol Jain<[email protected]>

Thanks,
Kajol Jain

2022-05-12 06:28:03

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH v8 2/5] perf tools: sync addition of PERF_MEM_SNOOPX_PEER

Em Wed, May 11, 2022 at 10:20:04AM +0800, Leo Yan escreveu:
> On Tue, May 10, 2022 at 01:28:38PM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Wed, May 04, 2022 at 06:48:47PM +0000, Ali Saidi escreveu:
> > > Add a flag to the perf mem data struct to signal that a request caused a
> > > cache-to-cache transfer of a line from a peer of the requestor and
> > > wasn't sourced from a lower cache level. The line being moved from one
> > > peer cache to another has latency and performance implications. On Arm64
> > > Neoverse systems the data source can indicate a cache-to-cache transfer
> > > but not if the line is dirty or clean, so instead of overloading HITM
> > > define a new flag that indicates this type of transfer.
> > >
> > > Signed-off-by: Ali Saidi <[email protected]>
> > > Reviewed-by: Leo Yan <[email protected]>
> >
> > Was this already merged on the ARM kernel tree?
>
> No, I don't think this patch has been merged on Arm kernel tree. I searched
> Arm and Arm64 git repos, none of them has merged this patch.
>
> https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/log/?qt=author&q=Ali+Saidi
> http://git.armlinux.org.uk/cgit/linux-arm.git/log/?qt=author&q=Ali+Saidi
>
> P.s. Ali missed to include German's review tag, see:
> https://lore.kernel.org/lkml/[email protected]/
>
> Do you want us to resend the patch set for adding tags?

I use b4 and it should collect Reviewed-by, Acked-by, etc tags, for
instance, if I use the message-id in your message:

⬢[acme@toolbox perf]$ b4 am -ctsl --cc-trailers 20220511022004.GA956170@leoy-ThinkPad-X240s
Looking up https://lore.kernel.org/r/20220511022004.GA956170%40leoy-ThinkPad-X240s
Grabbing thread from lore.kernel.org/all/20220511022004.GA956170%40leoy-ThinkPad-X240s/t.mbox.gz
Checking for newer revisions on https://lore.kernel.org/all/
Analyzing 12 messages in the thread
Checking attestation on all messages, may take a moment...
---
✓ [PATCH v8 1/5] perf: Add SNOOP_PEER flag to perf mem data struct
+ Reviewed-By: Kajol Jain<[email protected]> (✓ DKIM/ibm.com)
+ Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
+ Link: https://lore.kernel.org/r/[email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
✓ [PATCH v8 2/5] perf tools: sync addition of PERF_MEM_SNOOPX_PEER
+ Reviewed-By: Kajol Jain<[email protected]> (✓ DKIM/ibm.com)
+ Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
+ Link: https://lore.kernel.org/r/[email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
✓ [PATCH v8 3/5] perf mem: Print snoop peer flag
+ Reviewed-By: Kajol Jain<[email protected]> (✓ DKIM/ibm.com)
+ Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
+ Link: https://lore.kernel.org/r/[email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
✓ [PATCH v8 4/5] perf arm-spe: Don't set data source if it's not a memory operation
+ Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
+ Link: https://lore.kernel.org/r/[email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
✓ [PATCH v8 5/5] perf arm-spe: Use SPE data source for neoverse cores
+ Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
+ Link: https://lore.kernel.org/r/[email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
+ Cc: [email protected]
---
✓ Signed: DKIM/amazon.com
---
Total patches: 5
---
Cover: ./v8_20220504_alisaidi_perf_arm_spe_decode_spe_source_and_use_for_perf_c2c.cover
Link: https://lore.kernel.org/r/[email protected]
Base: not specified
git am ./v8_20220504_alisaidi_perf_arm_spe_decode_spe_source_and_use_for_perf_c2c.mbx
⬢[acme@toolbox perf]$

Somehow it is not being collected... :-\

Not even when I use:

> P.s. Ali missed to include German's review tag, see:
> https://lore.kernel.org/lkml/[email protected]/


[email protected]

Will try updating b4...

- Arnaldo

2022-05-12 08:42:22

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH v8 2/5] perf tools: sync addition of PERF_MEM_SNOOPX_PEER

Em Wed, May 11, 2022 at 03:28:00PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Wed, May 11, 2022 at 10:20:04AM +0800, Leo Yan escreveu:
> > On Tue, May 10, 2022 at 01:28:38PM -0300, Arnaldo Carvalho de Melo wrote:
> > > Em Wed, May 04, 2022 at 06:48:47PM +0000, Ali Saidi escreveu:
> > > > Add a flag to the perf mem data struct to signal that a request caused a
> > > > cache-to-cache transfer of a line from a peer of the requestor and
> > > > wasn't sourced from a lower cache level. The line being moved from one
> > > > peer cache to another has latency and performance implications. On Arm64
> > > > Neoverse systems the data source can indicate a cache-to-cache transfer
> > > > but not if the line is dirty or clean, so instead of overloading HITM
> > > > define a new flag that indicates this type of transfer.
> > > >
> > > > Signed-off-by: Ali Saidi <[email protected]>
> > > > Reviewed-by: Leo Yan <[email protected]>
> > >
> > > Was this already merged on the ARM kernel tree?
> >
> > No, I don't think this patch has been merged on Arm kernel tree. I searched
> > Arm and Arm64 git repos, none of them has merged this patch.
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/log/?qt=author&q=Ali+Saidi
> > http://git.armlinux.org.uk/cgit/linux-arm.git/log/?qt=author&q=Ali+Saidi
> >
> > P.s. Ali missed to include German's review tag, see:
> > https://lore.kernel.org/lkml/[email protected]/
> >
> > Do you want us to resend the patch set for adding tags?
>
> I use b4 and it should collect Reviewed-by, Acked-by, etc tags, for
> instance, if I use the message-id in your message:
>
> ⬢[acme@toolbox perf]$ b4 am -ctsl --cc-trailers 20220511022004.GA956170@leoy-ThinkPad-X240s
> Looking up https://lore.kernel.org/r/20220511022004.GA956170%40leoy-ThinkPad-X240s
> Grabbing thread from lore.kernel.org/all/20220511022004.GA956170%40leoy-ThinkPad-X240s/t.mbox.gz
> Checking for newer revisions on https://lore.kernel.org/all/
> Analyzing 12 messages in the thread
<SNIP>
> ✓ [PATCH v8 5/5] perf arm-spe: Use SPE data source for neoverse cores
> + Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
> + Link: https://lore.kernel.org/r/[email protected]
> + Cc: [email protected]
> + Cc: [email protected]
> + Cc: [email protected]
> + Cc: [email protected]
> + Cc: [email protected]
> + Cc: [email protected]
> + Cc: [email protected]
> + Cc: [email protected]
> + Cc: [email protected]
> + Cc: [email protected]
> + Cc: [email protected]
> + Cc: [email protected]
> + Cc: [email protected]
> + Cc: [email protected]
> + Cc: [email protected]
> + Cc: [email protected]
> + Cc: [email protected]
> + Cc: [email protected]
> + Cc: [email protected]
> ---
> ✓ Signed: DKIM/amazon.com
> ---
> Total patches: 5
> ---
> Cover: ./v8_20220504_alisaidi_perf_arm_spe_decode_spe_source_and_use_for_perf_c2c.cover
> Link: https://lore.kernel.org/r/[email protected]
> Base: not specified
> git am ./v8_20220504_alisaidi_perf_arm_spe_decode_spe_source_and_use_for_perf_c2c.mbx
> ⬢[acme@toolbox perf]$
>
> Somehow it is not being collected... :-\
>
> Not even when I use:
>
> > P.s. Ali missed to include German's review tag, see:
> > https://lore.kernel.org/lkml/[email protected]/
>
>
> [email protected]
>
> Will try updating b4...

Didn't help, so please collect the new tags and resubmit.

- Arnaldo