2024-02-29 00:19:41

by Ian Rogers

Subject: [PATCH v1 00/20] Python generated Intel metrics

Generate twenty sets of additional metrics for Intel models. The Rapl
and Idle metrics aren't specific to Intel but are placed here for ease
and convenience. The Smi and Tsx metrics are added so that they can be
dropped from the per-model JSON files. There are four sets of uncore
metrics and twelve sets of core metrics.

The cstate metrics require the event encoding fix of:
https://lore.kernel.org/lkml/[email protected]/

The patches should be applied on top of:
https://lore.kernel.org/lkml/[email protected]/

Ian Rogers (20):
perf jevents: Add RAPL metrics for all Intel models
perf jevents: Add idle metric for Intel models
perf jevents: Add smi metric group for Intel models
perf jevents: Add tsx metric group for Intel models
perf jevents: Add br metric group for branch statistics on Intel
perf jevents: Add software prefetch (swpf) metric group for Intel
perf jevents: Add ports metric group giving utilization on Intel
perf jevents: Add L2 metrics for Intel
perf jevents: Add load store breakdown metrics ldst for Intel
perf jevents: Add ILP metrics for Intel
perf jevents: Add context switch metrics for Intel
perf jevents: Add FPU metrics for Intel
perf jevents: Add cycles breakdown metric for Intel
perf jevents: Add Miss Level Parallelism (MLP) metric for Intel
perf jevents: Add mem_bw metric for Intel
perf jevents: Add local/remote "mem" breakdown metrics for Intel
perf jevents: Add dir breakdown metrics for Intel
perf jevents: Add C-State metrics from the PCU PMU for Intel
perf jevents: Add local/remote miss latency metrics for Intel
perf jevents: Add upi_bw metric for Intel

tools/perf/pmu-events/intel_metrics.py | 1040 +++++++++++++++++++++++-
1 file changed, 1037 insertions(+), 3 deletions(-)

--
2.44.0.278.ge034bb2e1d-goog



2024-02-29 00:19:49

by Ian Rogers

Subject: [PATCH v1 02/20] perf jevents: Add idle metric for Intel models

Use the msr PMU to compute the percentage of wallclock cycles for
which the CPUs are in a low power state.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/intel_metrics.py | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 5827f555005f..46866a25b166 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -1,7 +1,8 @@
#!/usr/bin/env python3
# SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
-from metric import (d_ratio, has_event, Event, JsonEncodeMetric, JsonEncodeMetricGroupDescriptions,
- LoadEvents, Metric, MetricGroup, Select)
+from metric import (d_ratio, has_event, max, Event, JsonEncodeMetric,
+ JsonEncodeMetricGroupDescriptions, LoadEvents, Metric,
+ MetricGroup, Select)
import argparse
import json
import math
@@ -17,6 +18,16 @@ LoadEvents(directory)

interval_sec = Event("duration_time")

+def Idle() -> Metric:
+ cyc = Event("msr/mperf/")
+ tsc = Event("msr/tsc/")
+ low = max(tsc - cyc, 0)
+ return Metric(
+ "idle",
+ "Percentage of total wallclock cycles where CPUs are in low power state (C1 or deeper sleep state)",
+ d_ratio(low, tsc), "100%")
+
+
def Rapl() -> MetricGroup:
"""Processor socket power consumption estimate.

@@ -52,6 +63,7 @@ def Rapl() -> MetricGroup:


all_metrics = MetricGroup("", [
+ Idle(),
Rapl(),
])

--
2.44.0.278.ge034bb2e1d-goog
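As a sanity check on the arithmetic the generated idle metric encodes, it reduces to max(tsc - mperf, 0) / tsc. A plain-Python sketch (not part of the patch; the counter values are made up, not real msr readings):

```python
def idle_percent(mperf: int, tsc: int) -> float:
    """Fraction of wallclock (TSC) cycles spent in a low power state.

    mperf only counts while the CPU is in C0, so tsc - mperf approximates
    cycles spent asleep; the max() guards against counter skew driving
    the difference negative.
    """
    low = max(tsc - mperf, 0)
    return low / tsc

# Hypothetical sample: the CPU was in C0 for 25% of the interval.
sample = idle_percent(mperf=250_000, tsc=1_000_000)
```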


2024-02-29 00:20:05

by Ian Rogers

Subject: [PATCH v1 03/20] perf jevents: Add smi metric group for Intel models

Allow the duplicated metric to be dropped from the per-model JSON files.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/intel_metrics.py | 18 +++++++++++++++++-
1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 46866a25b166..20c25d142f24 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -2,7 +2,7 @@
# SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
from metric import (d_ratio, has_event, max, Event, JsonEncodeMetric,
JsonEncodeMetricGroupDescriptions, LoadEvents, Metric,
- MetricGroup, Select)
+ MetricGroup, MetricRef, Select)
import argparse
import json
import math
@@ -62,9 +62,25 @@ def Rapl() -> MetricGroup:
description="Processor socket power consumption estimates")


+def Smi() -> MetricGroup:
+ aperf = Event('msr/aperf/')
+ cycles = Event('cycles')
+ smi_num = Event('msr/smi/')
+ smi_cycles = Select((aperf - cycles) / aperf, smi_num > 0, 0)
+ return MetricGroup('smi', [
+ Metric('smi_num', 'Number of SMI interrupts.',
+ smi_num, 'SMI#'),
+ # Note, the smi_cycles "Event" is really a reference to the metric.
+ Metric('smi_cycles',
+ 'Percentage of cycles spent in System Management Interrupts.',
+ smi_cycles, '100%', threshold=(MetricRef('smi_cycles') > 0.10))
+ ])
+
+
all_metrics = MetricGroup("", [
Idle(),
Rapl(),
+ Smi(),
])

if args.metricgroups:
--
2.44.0.278.ge034bb2e1d-goog
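The smi_cycles metric above relies on aperf counting all unhalted cycles while the cycles event stops counting during System Management Mode, so their gap approximates SMI time. A hand-computed sketch of that Select expression, with made-up counter values:

```python
def smi_cycles(aperf: int, cycles: int, smi_num: int) -> float:
    """Fraction of cycles spent in SMIs.

    Mirrors Select((aperf - cycles) / aperf, smi_num > 0, 0):
    report 0 when no SMI fired during the interval.
    """
    return (aperf - cycles) / aperf if smi_num > 0 else 0.0

with_smi = smi_cycles(aperf=1_000_000, cycles=900_000, smi_num=3)
without_smi = smi_cycles(aperf=1_000_000, cycles=900_000, smi_num=0)
```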


2024-02-29 00:20:20

by Ian Rogers

Subject: [PATCH v1 04/20] perf jevents: Add tsx metric group for Intel models

Allow the duplicated metric to be dropped from the per-model JSON files.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/intel_metrics.py | 51 ++++++++++++++++++++++++++
1 file changed, 51 insertions(+)

diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 20c25d142f24..1096accea2aa 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -7,6 +7,7 @@ import argparse
import json
import math
import os
+from typing import Optional

parser = argparse.ArgumentParser(description="Intel perf json generator")
parser.add_argument("-metricgroups", help="Generate metricgroups data", action='store_true')
@@ -77,10 +78,60 @@ def Smi() -> MetricGroup:
])


+def Tsx() -> Optional[MetricGroup]:
+ if args.model not in [
+ 'alderlake',
+ 'cascadelakex',
+ 'icelake',
+ 'icelakex',
+ 'rocketlake',
+ 'sapphirerapids',
+ 'skylake',
+ 'skylakex',
+ 'tigerlake',
+ ]:
+ return None
+
+ pmu = "cpu_core" if args.model == "alderlake" else "cpu"
+ cycles = Event('cycles')
+ cycles_in_tx = Event(f'{pmu}/cycles\-t/')
+ transaction_start = Event(f'{pmu}/tx\-start/')
+ cycles_in_tx_cp = Event(f'{pmu}/cycles\-ct/')
+ metrics = [
+ Metric('tsx_transactional_cycles',
+ 'Percentage of cycles within a transaction region.',
+ Select(cycles_in_tx / cycles, has_event(cycles_in_tx), 0),
+ '100%'),
+ Metric('tsx_aborted_cycles', 'Percentage of cycles in aborted transactions.',
+ Select(max(cycles_in_tx - cycles_in_tx_cp, 0) / cycles,
+ has_event(cycles_in_tx),
+ 0),
+ '100%'),
+ Metric('tsx_cycles_per_transaction',
+ 'Number of cycles within a transaction divided by the number of transactions.',
+ Select(cycles_in_tx / transaction_start,
+ has_event(cycles_in_tx),
+ 0),
+ "cycles / transaction"),
+ ]
+ if args.model != 'sapphirerapids':
+ elision_start = Event(f'{pmu}/el\-start/')
+ metrics += [
+ Metric('tsx_cycles_per_elision',
+ 'Number of cycles within a transaction divided by the number of elisions.',
+ Select(cycles_in_tx / elision_start,
+ has_event(elision_start),
+ 0),
+ "cycles / elision"),
+ ]
+ return MetricGroup('transaction', metrics)
+
+
all_metrics = MetricGroup("", [
Idle(),
Rapl(),
Smi(),
+ Tsx(),
])

if args.metricgroups:
--
2.44.0.278.ge034bb2e1d-goog
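To illustrate tsx_aborted_cycles: cycles-ct only counts transactional cycles that went on to commit, so cycles-t minus cycles-ct approximates the cycles thrown away by aborts. A plain-Python check with hypothetical counts:

```python
def tsx_aborted_cycles(cycles: int, cycles_in_tx: int,
                       cycles_in_tx_cp: int) -> float:
    """Fraction of all cycles spent in transactions that later aborted.

    cycles_in_tx_cp ("checkpointed") covers committed transactional
    cycles; max() guards against the subtraction going negative.
    """
    return max(cycles_in_tx - cycles_in_tx_cp, 0) / cycles

aborted = tsx_aborted_cycles(cycles=1_000_000, cycles_in_tx=200_000,
                             cycles_in_tx_cp=150_000)
```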


2024-02-29 00:20:30

by Ian Rogers

Subject: [PATCH v1 05/20] perf jevents: Add br metric group for branch statistics on Intel

The br metric group for branches comprises metric groups for total,
taken, conditional, fused and far branch statistics built from JSON
events. Conditional taken and not-taken metrics are specific to
Icelake and later generations, so a model-to-generation lookup is
added.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/intel_metrics.py | 139 +++++++++++++++++++++++++
1 file changed, 139 insertions(+)

diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 1096accea2aa..bee5da19d19d 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -19,6 +19,7 @@ LoadEvents(directory)

interval_sec = Event("duration_time")

+
def Idle() -> Metric:
cyc = Event("msr/mperf/")
tsc = Event("msr/tsc/")
@@ -127,11 +128,149 @@ def Tsx() -> Optional[MetricGroup]:
return MetricGroup('transaction', metrics)


+def IntelBr():
+ ins = Event("instructions")
+
+ def Total() -> MetricGroup:
+ br_all = Event("BR_INST_RETIRED.ALL_BRANCHES", "BR_INST_RETIRED.ANY")
+ br_m_all = Event("BR_MISP_RETIRED.ALL_BRANCHES",
+ "BR_INST_RETIRED.MISPRED",
+ "BR_MISP_EXEC.ANY")
+ br_clr = None
+ try:
+ br_clr = Event("BACLEARS.ANY", "BACLEARS.ALL")
+ except:
+ pass
+
+ br_r = d_ratio(br_all, interval_sec)
+ ins_r = d_ratio(ins, br_all)
+ misp_r = d_ratio(br_m_all, br_all)
+ clr_r = d_ratio(br_clr, interval_sec) if br_clr else None
+
+ return MetricGroup("br_total", [
+ Metric("br_total_retired",
+ "The number of branch instructions retired per second.", br_r,
+ "insn/s"),
+ Metric(
+ "br_total_mispred",
+ "The number of branch instructions retired, of any type, that were "
+ "not correctly predicted as a percentage of all branch instructions.",
+ misp_r, "100%"),
+ Metric("br_total_insn_between_branches",
+ "The number of instructions divided by the number of branches.",
+ ins_r, "insn"),
+ Metric("br_total_insn_fe_resteers",
+ "The number of resync branches per second.", clr_r, "req/s"
+ ) if clr_r else None
+ ])
+
+ def Taken() -> MetricGroup:
+ br_all = Event("BR_INST_RETIRED.ALL_BRANCHES", "BR_INST_RETIRED.ANY")
+ br_m_tk = None
+ try:
+ br_m_tk = Event("BR_MISP_RETIRED.NEAR_TAKEN",
+ "BR_MISP_RETIRED.TAKEN_JCC",
+ "BR_INST_RETIRED.MISPRED_TAKEN")
+ except:
+ pass
+ br_r = d_ratio(br_all, interval_sec)
+ ins_r = d_ratio(ins, br_all)
+ misp_r = d_ratio(br_m_tk, br_all) if br_m_tk else None
+ return MetricGroup("br_taken", [
+ Metric("br_taken_retired",
+ "The number of taken branches that were retired per second.",
+ br_r, "insn/s"),
+ Metric(
+ "br_taken_mispred",
+ "The number of retired taken branch instructions that were "
+ "mispredicted as a percentage of all taken branches.", misp_r,
+ "100%") if misp_r else None,
+ Metric(
+ "br_taken_insn_between_branches",
+ "The number of instructions divided by the number of taken branches.",
+ ins_r, "insn"),
+ ])
+
+ def Conditional() -> Optional[MetricGroup]:
+ try:
+ br_cond = Event("BR_INST_RETIRED.COND",
+ "BR_INST_RETIRED.CONDITIONAL",
+ "BR_INST_RETIRED.TAKEN_JCC")
+ br_m_cond = Event("BR_MISP_RETIRED.COND",
+ "BR_MISP_RETIRED.CONDITIONAL",
+ "BR_MISP_RETIRED.TAKEN_JCC")
+ except:
+ return None
+
+ br_cond_nt = None
+ br_m_cond_nt = None
+ try:
+ br_cond_nt = Event("BR_INST_RETIRED.COND_NTAKEN")
+ br_m_cond_nt = Event("BR_MISP_RETIRED.COND_NTAKEN")
+ except:
+ pass
+ br_r = d_ratio(br_cond, interval_sec)
+ ins_r = d_ratio(ins, br_cond)
+ misp_r = d_ratio(br_m_cond, br_cond)
+ taken_metrics = [
+ Metric("br_cond_retired", "Retired conditional branch instructions.",
+ br_r, "insn/s"),
+ Metric("br_cond_insn_between_branches",
+ "The number of instructions divided by the number of conditional "
+ "branches.", ins_r, "insn"),
+ Metric("br_cond_mispred",
+ "Retired conditional branch instructions mispredicted as a "
+ "percentage of all conditional branches.", misp_r, "100%"),
+ ]
+ if not br_m_cond_nt:
+ return MetricGroup("br_cond", taken_metrics)
+
+ br_r = d_ratio(br_cond_nt, interval_sec)
+ ins_r = d_ratio(ins, br_cond_nt)
+ misp_r = d_ratio(br_m_cond_nt, br_cond_nt)
+
+ not_taken_metrics = [
+ Metric("br_cond_nt_retired", "Retired conditional not taken branch instructions.",
+ br_r, "insn/s"),
+ Metric("br_cond_nt_insn_between_branches",
+ "The number of instructions divided by the number of not taken conditional "
+ "branches.", ins_r, "insn"),
+ Metric("br_cond_nt_mispred",
+ "Retired not taken conditional branch instructions mispredicted as a "
+ "percentage of all not taken conditional branches.", misp_r, "100%"),
+ ]
+ return MetricGroup("br_cond", [
+ MetricGroup("br_cond_nt", not_taken_metrics),
+ MetricGroup("br_cond_tkn", taken_metrics),
+ ])
+
+ def Far() -> Optional[MetricGroup]:
+ try:
+ br_far = Event("BR_INST_RETIRED.FAR_BRANCH")
+ except:
+ return None
+
+ br_r = d_ratio(br_far, interval_sec)
+ ins_r = d_ratio(ins, br_far)
+ return MetricGroup("br_far", [
+ Metric("br_far_retired", "Retired far control transfers per second.",
+ br_r, "insn/s"),
+ Metric(
+ "br_far_insn_between_branches",
+ "The number of instructions divided by the number of far branches.",
+ ins_r, "insn"),
+ ])
+
+ return MetricGroup("br", [Total(), Taken(), Conditional(), Far()],
+ description="breakdown of retired branch instructions")
+
+
all_metrics = MetricGroup("", [
Idle(),
Rapl(),
Smi(),
Tsx(),
+ IntelBr(),
])

if args.metricgroups:
--
2.44.0.278.ge034bb2e1d-goog
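Two of the br_total metrics above are simple ratios of retired-branch counts. A hand-computed sketch, using invented counter values rather than real event readings:

```python
def br_total_stats(instructions: int, br_all: int, br_mispred: int) -> dict:
    """br_total_mispred and br_total_insn_between_branches, computed
    directly: mispredicts as a share of all retired branches, and
    instructions retired per branch."""
    return {
        "br_total_mispred": br_mispred / br_all,
        "br_total_insn_between_branches": instructions / br_all,
    }

stats = br_total_stats(instructions=5_000_000, br_all=1_000_000,
                       br_mispred=20_000)
```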


2024-02-29 00:20:40

by Ian Rogers

Subject: [PATCH v1 06/20] perf jevents: Add software prefetch (swpf) metric group for Intel

Add metrics that break down software prefetch instruction use.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/intel_metrics.py | 65 ++++++++++++++++++++++++++
1 file changed, 65 insertions(+)

diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index bee5da19d19d..f11273e9935c 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -265,12 +265,77 @@ def IntelBr():
description="breakdown of retired branch instructions")


+def IntelSwpf() -> Optional[MetricGroup]:
+ ins = Event("instructions")
+ try:
+ s_ld = Event("MEM_INST_RETIRED.ALL_LOADS", "MEM_UOPS_RETIRED.ALL_LOADS")
+ s_nta = Event("SW_PREFETCH_ACCESS.NTA")
+ s_t0 = Event("SW_PREFETCH_ACCESS.T0")
+ s_t1 = Event("SW_PREFETCH_ACCESS.T1_T2")
+ s_w = Event("SW_PREFETCH_ACCESS.PREFETCHW")
+ except:
+ return None
+
+ all_sw = s_nta + s_t0 + s_t1 + s_w
+ swp_r = d_ratio(all_sw, interval_sec)
+ ins_r = d_ratio(ins, all_sw)
+ ld_r = d_ratio(s_ld, all_sw)
+
+ return MetricGroup("swpf", [
+ MetricGroup("swpf_totals", [
+ Metric("swpf_totals_exec", "Software prefetch instructions per second",
+ swp_r, "swpf/s"),
+ Metric("swpf_totals_insn_per_pf",
+ "Average number of instructions between software prefetches",
+ ins_r, "insn/swpf"),
+ Metric("swpf_totals_loads_per_pf",
+ "Average number of loads between software prefetches",
+ ld_r, "loads/swpf"),
+ ]),
+ MetricGroup("swpf_bkdwn", [
+ MetricGroup("swpf_bkdwn_nta", [
+ Metric("swpf_bkdwn_nta_per_swpf",
+ "Software prefetch NTA instructions as a percent of all prefetch instructions",
+ d_ratio(s_nta, all_sw), "100%"),
+ Metric("swpf_bkdwn_nta_rate",
+ "Software prefetch NTA instructions per second",
+ d_ratio(s_nta, interval_sec), "insn/s"),
+ ]),
+ MetricGroup("swpf_bkdwn_t0", [
+ Metric("swpf_bkdwn_t0_per_swpf",
+ "Software prefetch T0 instructions as a percent of all prefetch instructions",
+ d_ratio(s_t0, all_sw), "100%"),
+ Metric("swpf_bkdwn_t0_rate",
+ "Software prefetch T0 instructions per second",
+ d_ratio(s_t0, interval_sec), "insn/s"),
+ ]),
+ MetricGroup("swpf_bkdwn_t1_t2", [
+ Metric("swpf_bkdwn_t1_t2_per_swpf",
+ "Software prefetch T1 or T2 instructions as a percent of all prefetch instructions",
+ d_ratio(s_t1, all_sw), "100%"),
+ Metric("swpf_bkdwn_t1_t2_rate",
+ "Software prefetch T1 or T2 instructions per second",
+ d_ratio(s_t1, interval_sec), "insn/s"),
+ ]),
+ MetricGroup("swpf_bkdwn_w", [
+ Metric("swpf_bkdwn_w_per_swpf",
+ "Software prefetch W instructions as a percent of all prefetch instructions",
+ d_ratio(s_w, all_sw), "100%"),
+ Metric("swpf_bkdwn_w_rate",
+ "Software prefetch W instructions per second",
+ d_ratio(s_w, interval_sec), "insn/s"),
+ ]),
+ ]),
+ ], description="Software prefetch instruction breakdown")
+
+
all_metrics = MetricGroup("", [
Idle(),
Rapl(),
Smi(),
Tsx(),
IntelBr(),
+ IntelSwpf(),
])

if args.metricgroups:
--
2.44.0.278.ge034bb2e1d-goog
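The swpf_bkdwn_*_per_swpf metrics divide each prefetch flavor by the sum of all four, so the shares total 100%. A quick plain-Python illustration with made-up counts:

```python
def swpf_breakdown(nta: int, t0: int, t1_t2: int, w: int) -> dict:
    """Share of each software-prefetch flavor out of all software
    prefetches, mirroring d_ratio(flavor, all_sw) in the patch."""
    total = nta + t0 + t1_t2 + w
    counts = {"nta": nta, "t0": t0, "t1_t2": t1_t2, "w": w}
    return {name: count / total for name, count in counts.items()}

shares = swpf_breakdown(nta=400, t0=300, t1_t2=200, w=100)
```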


2024-02-29 00:20:55

by Ian Rogers

Subject: [PATCH v1 07/20] perf jevents: Add ports metric group giving utilization on Intel

The ports metric group contains a metric for each port giving its
utilization as a ratio of cycles. The metrics are created by looking
for UOPS_DISPATCHED.PORT events.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/intel_metrics.py | 33 ++++++++++++++++++++++++--
1 file changed, 31 insertions(+), 2 deletions(-)

diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index f11273e9935c..63d46ee1dca9 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -1,12 +1,13 @@
#!/usr/bin/env python3
# SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
from metric import (d_ratio, has_event, max, Event, JsonEncodeMetric,
- JsonEncodeMetricGroupDescriptions, LoadEvents, Metric,
- MetricGroup, MetricRef, Select)
+ JsonEncodeMetricGroupDescriptions, Literal, LoadEvents,
+ Metric, MetricGroup, MetricRef, Select)
import argparse
import json
import math
import os
+import re
from typing import Optional

parser = argparse.ArgumentParser(description="Intel perf json generator")
@@ -18,6 +19,11 @@ directory = f"{os.path.dirname(os.path.realpath(__file__))}/arch/x86/{args.model
LoadEvents(directory)

interval_sec = Event("duration_time")
+core_cycles = Event("CPU_CLK_UNHALTED.THREAD_P_ANY",
+ "CPU_CLK_UNHALTED.DISTRIBUTED",
+ "cycles")
+# Number of CPU cycles scaled for SMT.
+smt_cycles = Select(core_cycles / 2, Literal("#smt_on"), core_cycles)


def Idle() -> Metric:
@@ -265,6 +271,28 @@ def IntelBr():
description="breakdown of retired branch instructions")


+def IntelPorts() -> Optional[MetricGroup]:
+ pipeline_events = json.load(open(f"{os.path.dirname(os.path.realpath(__file__))}"
+ f"/arch/x86/{args.model}/pipeline.json"))
+
+ metrics = []
+ for x in pipeline_events:
+ if "EventName" in x and re.search("^UOPS_DISPATCHED.PORT", x["EventName"]):
+ name = x["EventName"]
+ port = re.search(r"(PORT_[0-9].*)", name).group(0).lower()
+ if name.endswith("_CORE"):
+ cyc = core_cycles
+ else:
+ cyc = smt_cycles
+ metrics.append(Metric(port, f"{port} utilization (higher is better)",
+ d_ratio(Event(name), cyc), "100%"))
+ if len(metrics) == 0:
+ return None
+
+ return MetricGroup("ports", metrics, "functional unit (port) utilization -- "
+ "fraction of cycles each port is utilized (higher is better)")
+
+
def IntelSwpf() -> Optional[MetricGroup]:
ins = Event("instructions")
try:
@@ -335,6 +363,7 @@ all_metrics = MetricGroup("", [
Smi(),
Tsx(),
IntelBr(),
+ IntelPorts(),
IntelSwpf(),
])

--
2.44.0.278.ge034bb2e1d-goog
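The SMT handling above can be restated directly: when SMT is on, a port is shared by two hyperthreads, so per-thread cycles are half the core cycles, which is what Select(core_cycles / 2, Literal("#smt_on"), core_cycles) expresses. A sketch with hypothetical counts:

```python
def port_utilization(port_uops: int, core_cycles: int, smt_on: bool) -> float:
    """Utilization of one execution port as a fraction of cycles,
    halving cycles when SMT shares the core between two threads."""
    cyc = core_cycles / 2 if smt_on else core_cycles
    return port_uops / cyc

util_smt = port_utilization(port_uops=300_000, core_cycles=1_000_000,
                            smt_on=True)
util_no_smt = port_utilization(port_uops=300_000, core_cycles=1_000_000,
                               smt_on=False)
```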


2024-02-29 00:21:11

by Ian Rogers

Subject: [PATCH v1 08/20] perf jevents: Add L2 metrics for Intel

Give a breakdown of various L2 counters as metrics, including totals,
reads, hardware prefetcher, RFO, code and evictions.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/intel_metrics.py | 158 +++++++++++++++++++++++++
1 file changed, 158 insertions(+)

diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 63d46ee1dca9..d22a1abca8d9 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -271,6 +271,163 @@ def IntelBr():
description="breakdown of retired branch instructions")


+def IntelL2() -> Optional[MetricGroup]:
+ try:
+ DC_HIT = Event("L2_RQSTS.DEMAND_DATA_RD_HIT")
+ except:
+ return None
+ try:
+ DC_MISS = Event("L2_RQSTS.DEMAND_DATA_RD_MISS")
+ l2_dmnd_miss = DC_MISS
+ l2_dmnd_rd_all = DC_MISS + DC_HIT
+ except:
+ DC_ALL = Event("L2_RQSTS.ALL_DEMAND_DATA_RD")
+ l2_dmnd_miss = DC_ALL - DC_HIT
+ l2_dmnd_rd_all = DC_ALL
+ l2_dmnd_mrate = d_ratio(l2_dmnd_miss, interval_sec)
+ l2_dmnd_rrate = d_ratio(l2_dmnd_rd_all, interval_sec)
+
+ DC_PFH = None
+ DC_PFM = None
+ l2_pf_all = None
+ l2_pf_mrate = None
+ l2_pf_rrate = None
+ try:
+ DC_PFH = Event("L2_RQSTS.PF_HIT")
+ DC_PFM = Event("L2_RQSTS.PF_MISS")
+ l2_pf_all = DC_PFH + DC_PFM
+ l2_pf_mrate = d_ratio(DC_PFM, interval_sec)
+ l2_pf_rrate = d_ratio(l2_pf_all, interval_sec)
+ except:
+ pass
+
+ DC_RFOH = Event("L2_RQSTS.RFO_HIT")
+ DC_RFOM = Event("L2_RQSTS.RFO_MISS")
+ l2_rfo_all = DC_RFOH + DC_RFOM
+ l2_rfo_mrate = d_ratio(DC_RFOM, interval_sec)
+ l2_rfo_rrate = d_ratio(l2_rfo_all, interval_sec)
+
+ DC_CH = Event("L2_RQSTS.CODE_RD_HIT")
+ DC_CM = Event("L2_RQSTS.CODE_RD_MISS")
+ DC_IN = Event("L2_LINES_IN.ALL")
+ DC_OUT_NS = None
+ DC_OUT_S = None
+ l2_lines_out = None
+ l2_out_rate = None
+ wbn = None
+ isd = None
+ try:
+ DC_OUT_NS = Event("L2_LINES_OUT.NON_SILENT",
+ "L2_LINES_OUT.DEMAND_DIRTY",
+ "L2_LINES_IN.S")
+ DC_OUT_S = Event("L2_LINES_OUT.SILENT",
+ "L2_LINES_OUT.DEMAND_CLEAN",
+ "L2_LINES_IN.I")
+ if DC_OUT_S.name == "L2_LINES_OUT.SILENT" and (
+ args.model.startswith("skylake") or
+ args.model == "cascadelakex"):
+ DC_OUT_S.name = "L2_LINES_OUT.SILENT/any/"
+ # bring it back to per-CPU
+ l2_s = Select(DC_OUT_S / 2, Literal("#smt_on"), DC_OUT_S)
+ l2_ns = DC_OUT_NS
+ l2_lines_out = l2_s + l2_ns
+ l2_out_rate = d_ratio(l2_lines_out, interval_sec)
+ nlr = max(l2_ns - DC_WB_U - DC_WB_D, 0)
+ wbn = d_ratio(nlr, interval_sec)
+ isd = d_ratio(l2_s, interval_sec)
+ except:
+ pass
+ DC_OUT_U = None
+ l2_pf_useless = None
+ l2_useless_rate = None
+ try:
+ DC_OUT_U = Event("L2_LINES_OUT.USELESS_HWPF")
+ l2_pf_useless = DC_OUT_U
+ l2_useless_rate = d_ratio(l2_pf_useless, interval_sec)
+ except:
+ pass
+ DC_WB_U = None
+ DC_WB_D = None
+ wbu = None
+ wbd = None
+ try:
+ DC_WB_U = Event("IDI_MISC.WB_UPGRADE")
+ DC_WB_D = Event("IDI_MISC.WB_DOWNGRADE")
+ wbu = d_ratio(DC_WB_U, interval_sec)
+ wbd = d_ratio(DC_WB_D, interval_sec)
+ except:
+ pass
+
+ l2_lines_in = DC_IN
+ l2_code_all = DC_CH + DC_CM
+ l2_code_rate = d_ratio(l2_code_all, interval_sec)
+ l2_code_miss_rate = d_ratio(DC_CM, interval_sec)
+ l2_in_rate = d_ratio(l2_lines_in, interval_sec)
+
+ return MetricGroup("l2", [
+ MetricGroup("l2_totals", [
+ Metric("l2_totals_in", "L2 cache total in per second",
+ l2_in_rate, "In/s"),
+ Metric("l2_totals_out", "L2 cache total out per second",
+ l2_out_rate, "Out/s") if l2_out_rate else None,
+ ]),
+ MetricGroup("l2_rd", [
+ Metric("l2_rd_hits", "L2 cache data read hits",
+ d_ratio(DC_HIT, l2_dmnd_rd_all), "100%"),
+ Metric("l2_rd_misses", "L2 cache data read misses",
+ d_ratio(l2_dmnd_miss, l2_dmnd_rd_all), "100%"),
+ Metric("l2_rd_requests", "L2 cache data read requests per second",
+ l2_dmnd_rrate, "requests/s"),
+ Metric("l2_rd_misses", "L2 cache data read misses per second",
+ l2_dmnd_mrate, "misses/s"),
+ ]),
+ MetricGroup("l2_hwpf", [
+ Metric("l2_hwpf_hits", "L2 cache hardware prefetcher hits",
+ d_ratio(DC_PFH, l2_pf_all), "100%"),
+ Metric("l2_hwpf_misses", "L2 cache hardware prefetcher misses",
+ d_ratio(DC_PFM, l2_pf_all), "100%"),
+ Metric("l2_hwpf_useless", "L2 cache hardware prefetcher useless prefetches per second",
+ l2_useless_rate, "requests/s") if l2_useless_rate else None,
+ Metric("l2_hwpf_requests", "L2 cache hardware prefetcher requests per second",
+ l2_pf_rrate, "requests/s"),
+ Metric("l2_hwpf_misses", "L2 cache hardware prefetcher misses per second",
+ l2_pf_mrate, "misses/s"),
+ ]) if DC_PFH else None,
+ MetricGroup("l2_rfo", [
+ Metric("l2_rfo_hits", "L2 cache request for ownership (RFO) hits",
+ d_ratio(DC_RFOH, l2_rfo_all), "100%"),
+ Metric("l2_rfo_misses", "L2 cache request for ownership (RFO) misses",
+ d_ratio(DC_RFOM, l2_rfo_all), "100%"),
+ Metric("l2_rfo_requests", "L2 cache request for ownership (RFO) requests per second",
+ l2_rfo_rrate, "requests/s"),
+ Metric("l2_rfo_misses", "L2 cache request for ownership (RFO) misses per second",
+ l2_rfo_mrate, "misses/s"),
+ ]),
+ MetricGroup("l2_code", [
+ Metric("l2_code_hits", "L2 cache code hits",
+ d_ratio(DC_CH, l2_code_all), "100%"),
+ Metric("l2_code_misses", "L2 cache code misses",
+ d_ratio(DC_CM, l2_code_all), "100%"),
+ Metric("l2_code_requests", "L2 cache code requests per second",
+ l2_code_rate, "requests/s"),
+ Metric("l2_code_misses", "L2 cache code misses per second",
+ l2_code_miss_rate, "misses/s"),
+ ]),
+ MetricGroup("l2_evict", [
+ MetricGroup("l2_evict_mef_lines", [
+ Metric("l2_evict_mef_lines_l3_hot_lru", "L2 evictions M/E/F lines L3 hot LRU per second",
+ wbu, "HotLRU/s") if wbu else None,
+ Metric("l2_evict_mef_lines_l3_norm_lru", "L2 evictions M/E/F lines L3 normal LRU per second",
+ wbn, "NormLRU/s") if wbn else None,
+ Metric("l2_evict_mef_lines_dropped", "L2 evictions M/E/F lines dropped per second",
+ wbd, "dropped/s") if wbd else None,
+ Metric("l2_evict_is_lines_dropped", "L2 evictions I/S lines dropped per second",
+ isd, "dropped/s") if isd else None,
+ ]),
+ ]),
+ ], description = "L2 data cache analysis")
+
+
def IntelPorts() -> Optional[MetricGroup]:
pipeline_events = json.load(open(f"{os.path.dirname(os.path.realpath(__file__))}"
f"/arch/x86/{args.model}/pipeline.json"))
@@ -363,6 +520,7 @@ all_metrics = MetricGroup("", [
Smi(),
Tsx(),
IntelBr(),
+ IntelL2(),
IntelPorts(),
IntelSwpf(),
])
--
2.44.0.278.ge034bb2e1d-goog
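The patch prefers L2_RQSTS.DEMAND_DATA_RD_MISS when the model has it and otherwise falls back to deriving misses from ALL_DEMAND_DATA_RD minus hits. Both paths should yield the same (misses, total reads) pair, which a small sketch with invented counts can check:

```python
def l2_demand_reads(hit: int, miss: int = None, all_reads: int = None):
    """Return (demand read misses, total demand reads) using either the
    explicit miss count or the all-reads-minus-hits fallback, as the
    try/except in IntelL2() does."""
    if miss is not None:
        return miss, miss + hit
    return all_reads - hit, all_reads

via_miss_event = l2_demand_reads(hit=700, miss=300)
via_fallback = l2_demand_reads(hit=700, all_reads=1000)
```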


2024-02-29 00:22:04

by Ian Rogers

Subject: [PATCH v1 11/20] perf jevents: Add context switch metrics for Intel

Add metrics that break down context switches by the different kinds of
instruction retired between them.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/intel_metrics.py | 55 ++++++++++++++++++++++++++
1 file changed, 55 insertions(+)

diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 0ca72aeec1ea..6ee708e84863 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -271,6 +271,60 @@ def IntelBr():
description="breakdown of retired branch instructions")


+def IntelCtxSw() -> MetricGroup:
+ cs = Event("context\-switches")
+ metrics = [
+ Metric("cs_rate", "Context switches per second", d_ratio(cs, interval_sec), "ctxsw/s")
+ ]
+
+ ev = Event("instructions")
+ metrics.append(Metric("cs_instr", "Instructions per context switch",
+ d_ratio(ev, cs), "instr/cs"))
+
+ ev = Event("cycles")
+ metrics.append(Metric("cs_cycles", "Cycles per context switch",
+ d_ratio(ev, cs), "cycles/cs"))
+
+ try:
+ ev = Event("MEM_INST_RETIRED.ALL_LOADS", "MEM_UOPS_RETIRED.ALL_LOADS")
+ metrics.append(Metric("cs_loads", "Loads per context switch",
+ d_ratio(ev, cs), "loads/cs"))
+ except:
+ pass
+
+ try:
+ ev = Event("MEM_INST_RETIRED.ALL_STORES", "MEM_UOPS_RETIRED.ALL_STORES")
+ metrics.append(Metric("cs_stores", "Stores per context switch",
+ d_ratio(ev, cs), "stores/cs"))
+ except:
+ pass
+
+ try:
+ ev = Event("BR_INST_RETIRED.NEAR_TAKEN", "BR_INST_RETIRED.TAKEN_JCC")
+ metrics.append(Metric("cs_br_taken", "Branches taken per context switch",
+ d_ratio(ev, cs), "br_taken/cs"))
+ except:
+ pass
+
+ try:
+ l2_misses = (Event("L2_RQSTS.DEMAND_DATA_RD_MISS") +
+ Event("L2_RQSTS.RFO_MISS") +
+ Event("L2_RQSTS.CODE_RD_MISS"))
+ try:
+ l2_misses += Event("L2_RQSTS.HWPF_MISS", "L2_RQSTS.L2_PF_MISS", "L2_RQSTS.PF_MISS")
+ except:
+ pass
+
+ metrics.append(Metric("cs_l2_misses", "L2 misses per context switch",
+ d_ratio(l2_misses, cs), "l2_misses/cs"))
+ except:
+ pass
+
+ return MetricGroup("cs", metrics,
+ description = ("Number of context switches per second, instructions "
+ "retired & core cycles between context switches"))
+
+
def IntelIlp() -> MetricGroup:
tsc = Event("msr/tsc/")
c0 = Event("msr/mperf/")
@@ -632,6 +686,7 @@ all_metrics = MetricGroup("", [
Smi(),
Tsx(),
IntelBr(),
+ IntelCtxSw(),
IntelIlp(),
IntelL2(),
IntelLdSt(),
--
2.44.0.278.ge034bb2e1d-goog
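The first three cs metrics are plain ratios against the context-switch count. Computed by hand with made-up sample numbers (not real perf output):

```python
def cs_metrics(context_switches: int, instructions: int, cycles: int,
               seconds: float) -> dict:
    """cs_rate, cs_instr and cs_cycles as defined in IntelCtxSw():
    switches per second, and instructions/cycles per switch."""
    return {
        "cs_rate": context_switches / seconds,
        "cs_instr": instructions / context_switches,
        "cs_cycles": cycles / context_switches,
    }

m = cs_metrics(context_switches=2_000, instructions=10_000_000,
               cycles=4_000_000, seconds=2.0)
```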


2024-02-29 00:22:44

by Ian Rogers

Subject: [PATCH v1 13/20] perf jevents: Add cycles breakdown metric for Intel

Break down cycles into user, kernel and guest.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/intel_metrics.py | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)

diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index dae44d296861..fef40969a4b8 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -26,6 +26,23 @@ core_cycles = Event("CPU_CLK_UNHALTED.THREAD_P_ANY",
smt_cycles = Select(core_cycles / 2, Literal("#smt_on"), core_cycles)


+def Cycles() -> MetricGroup:
+ cyc_k = Event("cycles:kHh")
+ cyc_g = Event("cycles:G")
+ cyc_u = Event("cycles:uH")
+ cyc = cyc_k + cyc_g + cyc_u
+
+ return MetricGroup("cycles", [
+ Metric("cycles_total", "Total number of cycles", cyc, "cycles"),
+ Metric("cycles_user", "User cycles as a percentage of all cycles",
+ d_ratio(cyc_u, cyc), "100%"),
+ Metric("cycles_kernel", "Kernel cycles as a percentage of all cycles",
+ d_ratio(cyc_k, cyc), "100%"),
+ Metric("cycles_guest", "Hypervisor guest cycles as a percentage of all cycles",
+ d_ratio(cyc_g, cyc), "100%"),
+ ], description = "cycles breakdown per privilege level (users, kernel, guest)")
+
+
def Idle() -> Metric:
cyc = Event("msr/mperf/")
tsc = Event("msr/tsc/")
@@ -770,6 +787,7 @@ def IntelLdSt() -> Optional[MetricGroup]:


all_metrics = MetricGroup("", [
+ Cycles(),
Idle(),
Rapl(),
Smi(),
--
2.44.0.278.ge034bb2e1d-goog
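The Cycles() group sums the three modifier-filtered counts back into a total and reports each privilege level as a share of it. A plain-Python sketch of that arithmetic, with hypothetical counts:

```python
def cycles_breakdown(user: int, kernel: int, guest: int) -> dict:
    """Privilege-level shares of all cycles, as in cycles_user,
    cycles_kernel and cycles_guest; the total is the sum of the parts."""
    total = user + kernel + guest
    return {
        "user": user / total,
        "kernel": kernel / total,
        "guest": guest / total,
    }

shares = cycles_breakdown(user=600, kernel=300, guest=100)
```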


2024-02-29 00:22:59

by Ian Rogers

Subject: [PATCH v1 14/20] perf jevents: Add Miss Level Parallelism (MLP) metric for Intel

Add a metric computing the number of outstanding load misses per cycle.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/intel_metrics.py | 15 +++++++++++++++
1 file changed, 15 insertions(+)

diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index fef40969a4b8..e373f87d499d 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -617,6 +617,20 @@ def IntelL2() -> Optional[MetricGroup]:
], description = "L2 data cache analysis")


+def IntelMlp() -> Optional[Metric]:
+ try:
+ l1d = Event("L1D_PEND_MISS.PENDING")
+ l1dc = Event("L1D_PEND_MISS.PENDING_CYCLES")
+ except:
+ return None
+
+ l1dc = Select(l1dc / 2, Literal("#smt_on"), l1dc)
+ ml = d_ratio(l1d, l1dc)
+ return Metric("mlp",
+ "Miss level parallelism - number of outstanding load misses per cycle (higher is better)",
+ ml, "load_miss_pending/cycle")
+
+
def IntelPorts() -> Optional[MetricGroup]:
pipeline_events = json.load(open(f"{os.path.dirname(os.path.realpath(__file__))}"
f"/arch/x86/{args.model}/pipeline.json"))
@@ -798,6 +812,7 @@ all_metrics = MetricGroup("", [
IntelIlp(),
IntelL2(),
IntelLdSt(),
+ IntelMlp(),
IntelPorts(),
IntelSwpf(),
])
--
2.44.0.278.ge034bb2e1d-goog
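The MLP metric divides total pending-miss occupancy by cycles with at least one miss pending, halving the cycle count when SMT is on since the counter is shared between sibling threads. A sketch with invented counts:

```python
def mlp(pending: int, pending_cycles: int, smt_on: bool) -> float:
    """Average outstanding L1D load misses per miss-pending cycle,
    mirroring d_ratio(L1D_PEND_MISS.PENDING, PENDING_CYCLES) with the
    Select-based SMT adjustment."""
    cyc = pending_cycles / 2 if smt_on else pending_cycles
    return pending / cyc

mlp_value = mlp(pending=8_000, pending_cycles=2_000, smt_on=False)
```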


2024-02-29 00:23:00

by Ian Rogers

Subject: [PATCH v1 09/20] perf jevents: Add load store breakdown metrics ldst for Intel

Give a breakdown of the number of load and store instructions. Use the
counter mask (cmask) to show the number of cycles taken to retire them.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/intel_metrics.py | 86 +++++++++++++++++++++++++-
1 file changed, 85 insertions(+), 1 deletion(-)

diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index d22a1abca8d9..0035e2441d6b 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -2,7 +2,7 @@
# SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
from metric import (d_ratio, has_event, max, Event, JsonEncodeMetric,
JsonEncodeMetricGroupDescriptions, Literal, LoadEvents,
- Metric, MetricGroup, MetricRef, Select)
+ Metric, MetricConstraint, MetricGroup, MetricRef, Select)
import argparse
import json
import math
@@ -514,6 +514,89 @@ def IntelSwpf() -> Optional[MetricGroup]:
], description="Sofware prefetch instruction breakdown")


+def IntelLdSt() -> Optional[MetricGroup]:
+ if args.model in [
+ "bonnell",
+ "nehalemep",
+ "nehalemex",
+ "westmereep-dp",
+ "westmereep-sp",
+ "westmereex",
+ ]:
+ return None
+ LDST_LD = Event("MEM_INST_RETIRED.ALL_LOADS", "MEM_UOPS_RETIRED.ALL_LOADS")
+ LDST_ST = Event("MEM_INST_RETIRED.ALL_STORES", "MEM_UOPS_RETIRED.ALL_STORES")
+ LDST_LDC1 = Event(f"{LDST_LD.name}/cmask=1/")
+ LDST_STC1 = Event(f"{LDST_ST.name}/cmask=1/")
+ LDST_LDC2 = Event(f"{LDST_LD.name}/cmask=2/")
+ LDST_STC2 = Event(f"{LDST_ST.name}/cmask=2/")
+ LDST_LDC3 = Event(f"{LDST_LD.name}/cmask=3/")
+ LDST_STC3 = Event(f"{LDST_ST.name}/cmask=3/")
+ ins = Event("instructions")
+ LDST_CYC = Event("CPU_CLK_UNHALTED.THREAD",
+ "CPU_CLK_UNHALTED.CORE_P",
+ "CPU_CLK_UNHALTED.THREAD_P")
+ LDST_PRE = None
+ try:
+ LDST_PRE = Event("LOAD_HIT_PREFETCH.SWPF", "LOAD_HIT_PRE.SW_PF")
+ except:
+ pass
+ LDST_AT = None
+ try:
+ LDST_AT = Event("MEM_INST_RETIRED.LOCK_LOADS")
+ except:
+ pass
+ cyc = LDST_CYC
+
+ ld_rate = d_ratio(LDST_LD, interval_sec)
+ st_rate = d_ratio(LDST_ST, interval_sec)
+ pf_rate = d_ratio(LDST_PRE, interval_sec) if LDST_PRE else None
+ at_rate = d_ratio(LDST_AT, interval_sec) if LDST_AT else None
+
+ ldst_ret_constraint = MetricConstraint.GROUPED_EVENTS
+ if LDST_LD.name == "MEM_UOPS_RETIRED.ALL_LOADS":
+ ldst_ret_constraint = MetricConstraint.NO_GROUP_EVENTS_NMI
+
+ return MetricGroup("ldst", [
+ MetricGroup("ldst_total", [
+ Metric("ldst_total_loads", "Load/store instructions total loads",
+ ld_rate, "loads"),
+ Metric("ldst_total_stores", "Load/store instructions total stores",
+ st_rate, "stores"),
+ ]),
+ MetricGroup("ldst_prcnt", [
+ Metric("ldst_prcnt_loads", "Percent of all instructions that are loads",
+ d_ratio(LDST_LD, ins), "100%"),
+ Metric("ldst_prcnt_stores", "Percent of all instructions that are stores",
+ d_ratio(LDST_ST, ins), "100%"),
+ ]),
+ MetricGroup("ldst_ret_lds", [
+ Metric("ldst_ret_lds_1", "Retired loads in 1 cycle",
+ d_ratio(max(LDST_LDC1 - LDST_LDC2, 0), cyc), "100%",
+ constraint = ldst_ret_constraint),
+ Metric("ldst_ret_lds_2", "Retired loads in 2 cycles",
+ d_ratio(max(LDST_LDC2 - LDST_LDC3, 0), cyc), "100%",
+ constraint = ldst_ret_constraint),
+ Metric("ldst_ret_lds_3", "Retired loads in 3 or more cycles",
+ d_ratio(LDST_LDC3, cyc), "100%"),
+ ]),
+ MetricGroup("ldst_ret_sts", [
+ Metric("ldst_ret_sts_1", "Retired stores in 1 cycle",
+ d_ratio(max(LDST_STC1 - LDST_STC2, 0), cyc), "100%",
+ constraint = ldst_ret_constraint),
+ Metric("ldst_ret_sts_2", "Retired stores in 2 cycles",
+ d_ratio(max(LDST_STC2 - LDST_STC3, 0), cyc), "100%",
+ constraint = ldst_ret_constraint),
+ Metric("ldst_ret_sts_3", "Retired stores in 3 more cycles",
+ d_ratio(LDST_STC3, cyc), "100%"),
+ ]),
+ Metric("ldst_ld_hit_swpf", "Load hit software prefetches per second",
+ pf_rate, "swpf/s") if pf_rate else None,
+ Metric("ldst_atomic_lds", "Atomic loads per second",
+ at_rate, "loads/s") if at_rate else None,
+ ], description = "Breakdown of load/store instructions")
+
+
all_metrics = MetricGroup("", [
Idle(),
Rapl(),
@@ -521,6 +604,7 @@ all_metrics = MetricGroup("", [
Tsx(),
IntelBr(),
IntelL2(),
+ IntelLdSt(),
IntelPorts(),
IntelSwpf(),
])
--
2.44.0.278.ge034bb2e1d-goog


2024-02-29 00:23:15

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v1 15/20] perf jevents: Add mem_bw metric for Intel

Break down memory bandwidth using uncore counters. For many models
this matches the memory_bandwidth_* metrics, but these metrics aren't
made available on all models. Add support for free running counters.
Query the event json when determining which events/counters are
available.
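The free-running counter discovery can be sketched as follows. The JSON entries below are invented stand-ins for a model's uncore-memory.json; only the event-name patterns are taken from the patch:

```python
import re

# Invented stand-in for arch/x86/<model>/uncore-memory.json entries.
mem_events = [
    {"EventName": "UNC_MC0_RDCAS_COUNT_FREERUN"},
    {"EventName": "UNC_MC1_RDCAS_COUNT_FREERUN"},
    {"EventName": "UNC_MC0_WRCAS_COUNT_FREERUN"},
    {"EventName": "UNC_M_SOMETHING_ELSE"},
]

rds, wrs = [], []
for x in mem_events:
    name = x.get("EventName", "")
    # One free-running CAS counter per memory controller channel.
    if re.search(r"^UNC_MC[0-9]+_RDCAS_COUNT_FREERUN", name):
        rds.append(name)  # the patch sums Event(name) objects instead
    elif re.search(r"^UNC_MC[0-9]+_WRCAS_COUNT_FREERUN", name):
        wrs.append(name)

print(len(rds), len(wrs))  # 2 1
```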

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/intel_metrics.py | 62 ++++++++++++++++++++++++++
1 file changed, 62 insertions(+)

diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index e373f87d499d..8d02be83b491 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -800,6 +800,67 @@ def IntelLdSt() -> Optional[MetricGroup]:
], description = "Breakdown of load/store instructions")


+def UncoreMemBw() -> Optional[MetricGroup]:
+ mem_events = []
+ try:
+ mem_events = json.load(open(f"{os.path.dirname(os.path.realpath(__file__))}"
+ f"/arch/x86/{args.model}/uncore-memory.json"))
+ except:
+ pass
+
+ ddr_rds = 0
+ ddr_wrs = 0
+ ddr_total = 0
+ for x in mem_events:
+ if "EventName" in x:
+ name = x["EventName"]
+ if re.search("^UNC_MC[0-9]+_RDCAS_COUNT_FREERUN", name):
+ ddr_rds += Event(name)
+ elif re.search("^UNC_MC[0-9]+_WRCAS_COUNT_FREERUN", name):
+ ddr_wrs += Event(name)
+ #elif re.search("^UNC_MC[0-9]+_TOTAL_REQCOUNT_FREERUN", name):
+ # ddr_total += Event(name)
+
+ if ddr_rds == 0:
+ try:
+ ddr_rds = Event("UNC_M_CAS_COUNT.RD")
+ ddr_wrs = Event("UNC_M_CAS_COUNT.WR")
+ except:
+ return None
+
+ ddr_total = ddr_rds + ddr_wrs
+
+ pmm_rds = 0
+ pmm_wrs = 0
+ try:
+ pmm_rds = Event("UNC_M_PMM_RPQ_INSERTS")
+ pmm_wrs = Event("UNC_M_PMM_WPQ_INSERTS")
+ except:
+ pass
+
+ pmm_total = pmm_rds + pmm_wrs
+
+ scale = 64 / 1_000_000
+ return MetricGroup("mem_bw", [
+ MetricGroup("mem_bw_ddr", [
+ Metric("mem_bw_ddr_read", "DDR memory read bandwidth",
+ d_ratio(ddr_rds, interval_sec), f"{scale}MB/s"),
+ Metric("mem_bw_ddr_write", "DDR memory write bandwidth",
+ d_ratio(ddr_wrs, interval_sec), f"{scale}MB/s"),
+ Metric("mem_bw_ddr_total", "DDR memory write bandwidth",
+ d_ratio(ddr_total, interval_sec), f"{scale}MB/s"),
+ ], description = "DDR Memory Bandwidth"),
+ MetricGroup("mem_bw_pmm", [
+ Metric("mem_bw_pmm_read", "PMM memory read bandwidth",
+ d_ratio(pmm_rds, interval_sec), f"{scale}MB/s"),
+ Metric("mem_bw_pmm_write", "PMM memory write bandwidth",
+ d_ratio(pmm_wrs, interval_sec), f"{scale}MB/s"),
+ Metric("mem_bw_pmm_total", "PMM memory write bandwidth",
+ d_ratio(pmm_total, interval_sec), f"{scale}MB/s"),
+ ], description = "PMM Memory Bandwidth") if pmm_rds != 0 else None,
+ ], description = "Memory Bandwidth")
+
+
all_metrics = MetricGroup("", [
Cycles(),
Idle(),
@@ -815,6 +876,7 @@ all_metrics = MetricGroup("", [
IntelMlp(),
IntelPorts(),
IntelSwpf(),
+ UncoreMemBw(),
])

if args.metricgroups:
--
2.44.0.278.ge034bb2e1d-goog


2024-02-29 00:23:15

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v1 10/20] perf jevents: Add ILP metrics for Intel

Use the counter mask (cmask) to count cycles by how many instructions
retired in them. Present as a set of ILP metrics.
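The resulting ILP distribution can be sketched in plain Python: the cmask=1..5 counters give cycles with at least N retirements, adjacent differences give exactly-N shares, and the zero-retirement share is whatever remains, mirroring the `ilp0 = 1 - sum` computation in the patch. Toy numbers only:

```python
def ilp_distribution(cmask_counts, core_cycles):
    """cmask_counts[i] = cycles with at least i+1 instructions retired
    (cmask=1..5 in the patch); returns shares for 0,1,2,3,4,5+."""
    ilp = [max(cmask_counts[i] - cmask_counts[i + 1], 0) / core_cycles
           for i in range(len(cmask_counts) - 1)]
    ilp.append(cmask_counts[-1] / core_cycles)  # 5-or-more tail
    ilp0 = 1 - sum(ilp)  # cycles in which nothing retired
    return [ilp0] + ilp

dist = ilp_distribution([800, 500, 300, 150, 50], 1000)
print(dist)  # first entry is the 0-retired share; entries sum to 1
```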

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/intel_metrics.py | 30 ++++++++++++++++++++++++++
1 file changed, 30 insertions(+)

diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 0035e2441d6b..0ca72aeec1ea 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -271,6 +271,35 @@ def IntelBr():
description="breakdown of retired branch instructions")


+def IntelIlp() -> MetricGroup:
+ tsc = Event("msr/tsc/")
+ c0 = Event("msr/mperf/")
+ low = tsc - c0
+ inst_ret = Event("INST_RETIRED.ANY_P")
+ inst_ret_c = [Event(f"{inst_ret.name}/cmask={x}/") for x in range(1, 6)]
+ ilp = [d_ratio(max(inst_ret_c[x] - inst_ret_c[x + 1], 0), core_cycles) for x in range(0, 4)]
+ ilp.append(d_ratio(inst_ret_c[4], core_cycles))
+ ilp0 = 1
+ for x in ilp:
+ ilp0 -= x
+ return MetricGroup("ilp", [
+ Metric("ilp_idle", "Lower power cycles as a percentage of all cycles",
+ d_ratio(low, tsc), "100%"),
+ Metric("ilp_inst_ret_0", "Instructions retired in 0 cycles as a percentage of all cycles",
+ ilp0, "100%"),
+ Metric("ilp_inst_ret_1", "Instructions retired in 1 cycles as a percentage of all cycles",
+ ilp[0], "100%"),
+ Metric("ilp_inst_ret_2", "Instructions retired in 2 cycles as a percentage of all cycles",
+ ilp[1], "100%"),
+ Metric("ilp_inst_ret_3", "Instructions retired in 3 cycles as a percentage of all cycles",
+ ilp[2], "100%"),
+ Metric("ilp_inst_ret_4", "Instructions retired in 4 cycles as a percentage of all cycles",
+ ilp[3], "100%"),
+ Metric("ilp_inst_ret_5", "Instructions retired in 5 or more cycles as a percentage of all cycles",
+ ilp[4], "100%"),
+ ])
+
+
def IntelL2() -> Optional[MetricGroup]:
try:
DC_HIT = Event("L2_RQSTS.DEMAND_DATA_RD_HIT")
@@ -603,6 +632,7 @@ all_metrics = MetricGroup("", [
Smi(),
Tsx(),
IntelBr(),
+ IntelIlp(),
IntelL2(),
IntelLdSt(),
IntelPorts(),
--
2.44.0.278.ge034bb2e1d-goog


2024-02-29 00:23:47

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v1 17/20] perf jevents: Add dir breakdown metrics for Intel

Break down directory hits, misses and update requests. The implementation
uses the M2M and CHA PMUs present in server models broadwellde, broadwellx,
cascadelakex, emeraldrapids, icelakex, sapphirerapids and skylakex.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/intel_metrics.py | 36 ++++++++++++++++++++++++++
1 file changed, 36 insertions(+)

diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 82fd23cf5500..07aafdf77f79 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -800,6 +800,41 @@ def IntelLdSt() -> Optional[MetricGroup]:
], description = "Breakdown of load/store instructions")


+def UncoreDir() -> Optional[MetricGroup]:
+ try:
+ m2m_upd = Event("UNC_M2M_DIRECTORY_UPDATE.ANY")
+ m2m_hits = Event("UNC_M2M_DIRECTORY_HIT.DIRTY_I")
+ # Turn the umask into an ANY rather than DIRTY_I filter.
+ m2m_hits.name += "/umask=0xFF,name=UNC_M2M_DIRECTORY_HIT.ANY/"
+ m2m_miss = Event("UNC_M2M_DIRECTORY_MISS.DIRTY_I")
+ # Turn the umask into an ANY rather than DIRTY_I filter.
+ m2m_miss.name += "/umask=0xFF,name=UNC_M2M_DIRECTORY_MISS.ANY/"
+ cha_upd = Event("UNC_CHA_DIR_UPDATE.HA")
+ # Turn the umask into an ANY rather than HA filter.
+ cha_upd.name += "/umask=3,name=UNC_CHA_DIR_UPDATE.ANY/"
+ except:
+ return None
+
+ m2m_total = m2m_hits + m2m_miss
+ upd = m2m_upd + cha_upd # in cache lines
+ upd_r = upd / interval_sec
+ look_r = m2m_total / interval_sec
+
+ scale = 64 / 1_000_000 # Cache lines to MB
+ return MetricGroup("dir", [
+ Metric("dir_lookup_rate", "",
+ d_ratio(m2m_total, interval_sec), "requests/s"),
+ Metric("dir_lookup_hits", "",
+ d_ratio(m2m_hits, m2m_total), "100%"),
+ Metric("dir_lookup_misses", "",
+ d_ratio(m2m_miss, m2m_total), "100%"),
+ Metric("dir_update_requests", "",
+ d_ratio(m2m_upd + cha_upd, interval_sec), "requests/s"),
+ Metric("dir_update_bw", "",
+ d_ratio(m2m_upd + cha_upd, interval_sec), f"{scale}MB/s"),
+ ])
+
+
def UncoreMem() -> Optional[MetricGroup]:
try:
loc_rds = Event("UNC_CHA_REQUESTS.READS_LOCAL", "UNC_H_REQUESTS.READS_LOCAL")
@@ -902,6 +937,7 @@ all_metrics = MetricGroup("", [
IntelMlp(),
IntelPorts(),
IntelSwpf(),
+ UncoreDir(),
UncoreMem(),
UncoreMemBw(),
])
--
2.44.0.278.ge034bb2e1d-goog


2024-02-29 00:24:00

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v1 18/20] perf jevents: Add C-State metrics from the PCU PMU for Intel

Use occupancy events fixed in:
https://lore.kernel.org/lkml/[email protected]/

Metrics are at the socket level referring to cores, not hyperthreads.
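The fused-off-core correction in this patch can be sketched in plain Python: when C0+C3+C6 occupancy exceeds the PCU ticks times the advertised core count, the excess (fused-off cores parked in C6/C7) is subtracted from the C6 count, clamped at zero, mirroring the `Select` expression in the diff. Toy values only:

```python
def corrected_c6(c0, c3, c6, pcu_ticks, cores_per_socket):
    # cores_per_socket corresponds to #num_cores / #num_packages.
    max_cycles = pcu_ticks * cores_per_socket
    total = c0 + c3 + c6
    if total > max_cycles:
        # Fused-off cores show up as extra C6/C7 occupancy.
        return max(c6 - (total - max_cycles), 0)
    return c6

# 8 advertised cores, but occupancy sums to 9 cores' worth of ticks:
print(corrected_c6(c0=3000, c3=1000, c6=5000,
                   pcu_ticks=1000, cores_per_socket=8))  # 4000
```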

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/intel_metrics.py | 27 ++++++++++++++++++++++++++
1 file changed, 27 insertions(+)

diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 07aafdf77f79..1b9f7cd3b789 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -800,6 +800,32 @@ def IntelLdSt() -> Optional[MetricGroup]:
], description = "Breakdown of load/store instructions")


+def UncoreCState() -> Optional[MetricGroup]:
+ try:
+ pcu_ticks = Event("UNC_P_CLOCKTICKS")
+ c0 = Event("UNC_P_POWER_STATE_OCCUPANCY.CORES_C0")
+ c3 = Event("UNC_P_POWER_STATE_OCCUPANCY.CORES_C3")
+ c6 = Event("UNC_P_POWER_STATE_OCCUPANCY.CORES_C6")
+ except:
+ return None
+
+ num_cores = Literal("#num_cores") / Literal("#num_packages")
+
+ max_cycles = pcu_ticks * num_cores
+ total_cycles = c0 + c3 + c6
+
+ # Remove fused-off cores, which show up in C6/C7.
+ c6 = Select(max(c6 - (total_cycles - max_cycles), 0),
+ total_cycles > max_cycles,
+ c6)
+
+ return MetricGroup("cstate", [
+ Metric("cstate_c0", "C-State cores in C0/C1", d_ratio(c0, pcu_ticks), "cores"),
+ Metric("cstate_c3", "C-State cores in C3", d_ratio(c3, pcu_ticks), "cores"),
+ Metric("cstate_c6", "C-State cores in C6/C7", d_ratio(c6, pcu_ticks), "cores"),
+ ])
+
+
def UncoreDir() -> Optional[MetricGroup]:
try:
m2m_upd = Event("UNC_M2M_DIRECTORY_UPDATE.ANY")
@@ -937,6 +963,7 @@ all_metrics = MetricGroup("", [
IntelMlp(),
IntelPorts(),
IntelSwpf(),
+ UncoreCState(),
UncoreDir(),
UncoreMem(),
UncoreMemBw(),
--
2.44.0.278.ge034bb2e1d-goog


2024-02-29 00:24:19

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v1 12/20] perf jevents: Add FPU metrics for Intel

Add metrics breaking down floating point operations.
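The FLOP weighting in this patch can be sketched in plain Python: each packed event contributes vector-width divided by element-width operations per instruction (e.g. a 256-bit packed-single op counts 8 FLOPs). The weights below match the patch's arithmetic; the counts are invented:

```python
# (event, flops-per-op) weights matching the patch's flop expression.
FLOP_WEIGHTS = {
    "scalar_single": 1, "scalar_double": 1,
    "128b_single": 4, "128b_double": 2,
    "256b_single": 8, "256b_double": 4,
    "512b_single": 16, "512b_double": 8,
}

def total_flops(counts):
    """Weighted sum of per-event instruction counts."""
    return sum(FLOP_WEIGHTS[k] * v for k, v in counts.items())

counts = {"scalar_double": 100, "256b_single": 10, "512b_double": 5}
print(total_flops(counts))  # 100*1 + 10*8 + 5*8 = 220
```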

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/intel_metrics.py | 90 ++++++++++++++++++++++++++
1 file changed, 90 insertions(+)

diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 6ee708e84863..dae44d296861 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -325,6 +325,95 @@ def IntelCtxSw() -> MetricGroup:
"retired & core cycles between context switches"))


+def IntelFpu() -> Optional[MetricGroup]:
+ cyc = Event("cycles")
+ try:
+ s_64 = Event("FP_ARITH_INST_RETIRED.SCALAR_SINGLE",
+ "SIMD_INST_RETIRED.SCALAR_SINGLE")
+ except:
+ return None
+ d_64 = Event("FP_ARITH_INST_RETIRED.SCALAR_DOUBLE",
+ "SIMD_INST_RETIRED.SCALAR_DOUBLE")
+ s_128 = Event("FP_ARITH_INST_RETIRED.128B_PACKED_SINGLE",
+ "SIMD_INST_RETIRED.PACKED_SINGLE")
+
+ flop = s_64 + d_64 + 4 * s_128
+
+ d_128 = None
+ s_256 = None
+ d_256 = None
+ s_512 = None
+ d_512 = None
+ try:
+ d_128 = Event("FP_ARITH_INST_RETIRED.128B_PACKED_DOUBLE")
+ flop += 2 * d_128
+ s_256 = Event("FP_ARITH_INST_RETIRED.256B_PACKED_SINGLE")
+ flop += 8 * s_256
+ d_256 = Event("FP_ARITH_INST_RETIRED.256B_PACKED_DOUBLE")
+ flop += 4 * d_256
+ s_512 = Event("FP_ARITH_INST_RETIRED.512B_PACKED_SINGLE")
+ flop += 16 * s_512
+ d_512 = Event("FP_ARITH_INST_RETIRED.512B_PACKED_DOUBLE")
+ flop += 8 * d_512
+ except:
+ pass
+
+ f_assist = Event("ASSISTS.FP", "FP_ASSIST.ANY", "FP_ASSIST.S")
+ if f_assist.name in [
+ "ASSISTS.FP",
+ "FP_ASSIST.S",
+ ]:
+ f_assist += "/cmask=1/"
+
+ flop_r = d_ratio(flop, interval_sec)
+ flop_c = d_ratio(flop, cyc)
+ nmi_constraint = MetricConstraint.GROUPED_EVENTS
+ if f_assist.name == "ASSISTS.FP": # Icelake+
+ nmi_constraint = MetricConstraint.NO_GROUP_EVENTS_NMI
+ def FpuMetrics(group: str, fl: Optional[Event], mult: int, desc: str) -> Optional[MetricGroup]:
+ if not fl:
+ return None
+
+ f = fl * mult
+ fl_r = d_ratio(f, interval_sec)
+ r_s = d_ratio(fl, interval_sec)
+ return MetricGroup(group, [
+ Metric(f"{group}_of_total", desc + " floating point operations per second",
+ d_ratio(f, flop), "100%"),
+ Metric(f"{group}_flops", desc + " floating point operations per second",
+ fl_r, "flops/s"),
+ Metric(f"{group}_ops", desc + " operations per second",
+ r_s, "ops/s"),
+ ])
+
+ return MetricGroup("fpu", [
+ MetricGroup("fpu_total", [
+ Metric("fpu_total_flops", "Floating point operations per second",
+ flop_r, "flops/s"),
+ Metric("fpu_total_flopc", "Floating point operations per cycle",
+ flop_c, "flops/cycle", constraint=nmi_constraint),
+ ]),
+ MetricGroup("fpu_64", [
+ FpuMetrics("fpu_64_single", s_64, 1, "64-bit single"),
+ FpuMetrics("fpu_64_double", d_64, 1, "64-bit double"),
+ ]),
+ MetricGroup("fpu_128", [
+ FpuMetrics("fpu_128_single", s_128, 4, "128-bit packed single"),
+ FpuMetrics("fpu_128_double", d_128, 2, "128-bit packed double"),
+ ]),
+ MetricGroup("fpu_256", [
+ FpuMetrics("fpu_256_single", s_256, 8, "128-bit packed single"),
+ FpuMetrics("fpu_256_double", d_256, 4, "128-bit packed double"),
+ ]),
+ MetricGroup("fpu_512", [
+ FpuMetrics("fpu_512_single", s_512, 16, "128-bit packed single"),
+ FpuMetrics("fpu_512_double", d_512, 8, "128-bit packed double"),
+ ]),
+ Metric("fpu_assists", "FP assists as a percentage of cycles",
+ d_ratio(f_assist, cyc), "100%"),
+ ])
+
+
def IntelIlp() -> MetricGroup:
tsc = Event("msr/tsc/")
c0 = Event("msr/mperf/")
@@ -687,6 +776,7 @@ all_metrics = MetricGroup("", [
Tsx(),
IntelBr(),
IntelCtxSw(),
+ IntelFpu(),
IntelIlp(),
IntelL2(),
IntelLdSt(),
--
2.44.0.278.ge034bb2e1d-goog


2024-02-29 00:24:31

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v1 20/20] perf jevents: Add upi_bw metric for Intel

Break down UPI read and write bandwidth using uncore_upi counters.
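The 64/9 scale used here follows the uncore manual's rule that data bandwidth is .ALL_DATA / 9 * 64B, i.e. nine ALL_DATA flits carry one 64-byte cache line of payload. A minimal sketch of the conversion, with an invented flit rate:

```python
def upi_data_mb_per_s(flits_per_sec):
    # 9 ALL_DATA flits per 64B cache line of payload; 1e6 bytes per MB.
    return flits_per_sec * (64 / 9) / 1_000_000

# 9M flits/s is exactly one million cache lines/s = 64 MB/s of payload:
print(round(upi_data_mb_per_s(9_000_000), 3))  # 64.0
```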

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/intel_metrics.py | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)

diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index cdeb58e17c5e..219541a30450 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -1006,6 +1006,27 @@ def UncoreMemBw() -> Optional[MetricGroup]:
], description = "Memory Bandwidth")


+def UncoreUpiBw() -> Optional[MetricGroup]:
+ try:
+ upi_rds = Event("UNC_UPI_RxL_FLITS.ALL_DATA")
+ upi_wrs = Event("UNC_UPI_TxL_FLITS.ALL_DATA")
+ except:
+ return None
+
+ upi_total = upi_rds + upi_wrs
+
+ # From "Uncore Performance Monitoring": When measuring the amount of
+ # bandwidth consumed by transmission of the data (i.e. NOT including
+ # the header), it should be .ALL_DATA / 9 * 64B.
+ scale = (64 / 9) / 1_000_000
+ return MetricGroup("upi_bw", [
+ Metric("upi_bw_read", "UPI read bandwidth",
+ d_ratio(upi_rds, interval_sec), f"{scale}MB/s"),
+ Metric("upi_bw_write", "DDR memory write bandwidth",
+ d_ratio(upi_wrs, interval_sec), f"{scale}MB/s"),
+ ], description = "UPI Bandwidth")
+
+
all_metrics = MetricGroup("", [
Cycles(),
Idle(),
@@ -1026,6 +1047,7 @@ all_metrics = MetricGroup("", [
UncoreDir(),
UncoreMem(),
UncoreMemBw(),
+ UncoreUpiBw(),
])

if args.metricgroups:
--
2.44.0.278.ge034bb2e1d-goog


2024-02-29 00:25:38

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v1 16/20] perf jevents: Add local/remote "mem" breakdown metrics for Intel

Break down local and remote memory bandwidth, reads and writes. The
implementation uses the HA and CHA PMUs present in server models
broadwellde, broadwellx, cascadelakex, emeraldrapids, haswellx,
icelakex, ivytown, sapphirerapids and skylakex.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/intel_metrics.py | 27 ++++++++++++++++++++++++++
1 file changed, 27 insertions(+)

diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 8d02be83b491..82fd23cf5500 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -800,6 +800,32 @@ def IntelLdSt() -> Optional[MetricGroup]:
], description = "Breakdown of load/store instructions")


+def UncoreMem() -> Optional[MetricGroup]:
+ try:
+ loc_rds = Event("UNC_CHA_REQUESTS.READS_LOCAL", "UNC_H_REQUESTS.READS_LOCAL")
+ rem_rds = Event("UNC_CHA_REQUESTS.READS_REMOTE", "UNC_H_REQUESTS.READS_REMOTE")
+ loc_wrs = Event("UNC_CHA_REQUESTS.WRITES_LOCAL", "UNC_H_REQUESTS.WRITES_LOCAL")
+ rem_wrs = Event("UNC_CHA_REQUESTS.WRITES_REMOTE", "UNC_H_REQUESTS.WRITES_REMOTE")
+ except:
+ return None
+
+ scale = 64 / 1_000_000
+ return MetricGroup("mem", [
+ MetricGroup("mem_local", [
+ Metric("mem_local_read", "Local memory read bandwidth not including directory updates",
+ d_ratio(loc_rds, interval_sec), f"{scale}MB/s"),
+ Metric("mem_local_write", "Local memory write bandwidth not including directory updates",
+ d_ratio(loc_wrs, interval_sec), f"{scale}MB/s"),
+ ]),
+ MetricGroup("mem_remote", [
+ Metric("mem_remote_read", "Remote memory read bandwidth not including directory updates",
+ d_ratio(rem_rds, interval_sec), f"{scale}MB/s"),
+ Metric("mem_remote_write", "Remote memory write bandwidth not including directory updates",
+ d_ratio(rem_wrs, interval_sec), f"{scale}MB/s"),
+ ]),
+ ], description = "Memory Bandwidth breakdown local vs. remote (remote requests in). directory updates not included")
+
+
def UncoreMemBw() -> Optional[MetricGroup]:
mem_events = []
try:
@@ -876,6 +902,7 @@ all_metrics = MetricGroup("", [
IntelMlp(),
IntelPorts(),
IntelSwpf(),
+ UncoreMem(),
UncoreMemBw(),
])

--
2.44.0.278.ge034bb2e1d-goog


2024-02-29 00:26:42

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v1 19/20] perf jevents: Add local/remote miss latency metrics for Intel

Derive the average miss latency from CBOX/CHA occupancy and inserts
counters, as described in Intel's uncore performance monitoring reference.
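One way to read the latency formula in this patch is as Little's law applied per CHA tick: average outstanding misses (occupancy / clockticks) divided by the miss arrival rate (inserts / wall time) gives the average residency time, and the 1e9 converts seconds to nanoseconds. A minimal sketch with invented counter values:

```python
def avg_miss_latency_ns(occupancy, inserts, ticks, interval_sec):
    # Little's law: L = lambda * W  =>  W = L / lambda, with
    # L = occupancy / ticks (average outstanding misses) and
    # lambda = inserts / interval_sec (miss arrival rate).
    return interval_sec * 1e9 * occupancy / (ticks * inserts)

# 2e9 CHA ticks over 1s, 10M inserts, average 4 misses outstanding:
print(avg_miss_latency_ns(occupancy=8e9, inserts=10e6,
                          ticks=2e9, interval_sec=1.0))  # 400.0
```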

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/pmu-events/intel_metrics.py | 59 ++++++++++++++++++++++++++
1 file changed, 59 insertions(+)

diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
index 1b9f7cd3b789..cdeb58e17c5e 100755
--- a/tools/perf/pmu-events/intel_metrics.py
+++ b/tools/perf/pmu-events/intel_metrics.py
@@ -617,6 +617,64 @@ def IntelL2() -> Optional[MetricGroup]:
], description = "L2 data cache analysis")


+def IntelMissLat() -> Optional[MetricGroup]:
+ try:
+ ticks = Event("UNC_CHA_CLOCKTICKS", "UNC_C_CLOCKTICKS")
+ data_rd_loc_occ = Event("UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_LOCAL",
+ "UNC_CHA_TOR_OCCUPANCY.IA_MISS",
+ "UNC_C_TOR_OCCUPANCY.MISS_LOCAL_OPCODE",
+ "UNC_C_TOR_OCCUPANCY.MISS_OPCODE")
+ data_rd_loc_ins = Event("UNC_CHA_TOR_INSERTS.IA_MISS_DRD_LOCAL",
+ "UNC_CHA_TOR_INSERTS.IA_MISS",
+ "UNC_C_TOR_INSERTS.MISS_LOCAL_OPCODE",
+ "UNC_C_TOR_INSERTS.MISS_OPCODE")
+ data_rd_rem_occ = Event("UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE",
+ "UNC_CHA_TOR_OCCUPANCY.IA_MISS",
+ "UNC_C_TOR_OCCUPANCY.MISS_REMOTE_OPCODE",
+ "UNC_C_TOR_OCCUPANCY.NID_MISS_OPCODE")
+ data_rd_rem_ins = Event("UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE",
+ "UNC_CHA_TOR_INSERTS.IA_MISS",
+ "UNC_C_TOR_INSERTS.MISS_REMOTE_OPCODE",
+ "UNC_C_TOR_INSERTS.NID_MISS_OPCODE")
+ except:
+ return None
+
+ if (data_rd_loc_occ.name == "UNC_C_TOR_OCCUPANCY.MISS_LOCAL_OPCODE" or
+ data_rd_loc_occ.name == "UNC_C_TOR_OCCUPANCY.MISS_OPCODE"):
+ data_rd = 0x182
+ for e in [data_rd_loc_occ, data_rd_loc_ins, data_rd_rem_occ, data_rd_rem_ins]:
+ e.name += f"/filter_opc={hex(data_rd)}/"
+ elif data_rd_loc_occ.name == "UNC_CHA_TOR_OCCUPANCY.IA_MISS":
+ # Demand Data Read - Full cache-line read requests from core for
+ # lines to be cached in S or E, typically for data
+ demand_data_rd = 0x202
+ # LLC Prefetch Data - Uncore will first look up the line in the
+ # LLC; for a cache hit, the LRU will be updated, on a miss, the
+ # DRd will be initiated
+ llc_prefetch_data = 0x25a
+ local_filter = (f"/filter_opc0={hex(demand_data_rd)},"
+ f"filter_opc1={hex(llc_prefetch_data)},"
+ "filter_loc,filter_nm,filter_not_nm/")
+ remote_filter = (f"/filter_opc0={hex(demand_data_rd)},"
+ f"filter_opc1={hex(llc_prefetch_data)},"
+ "filter_rem,filter_nm,filter_not_nm/")
+ for e in [data_rd_loc_occ, data_rd_loc_ins]:
+ e.name += local_filter
+ for e in [data_rd_rem_occ, data_rd_rem_ins]:
+ e.name += remote_filter
+ else:
+ assert data_rd_loc_occ.name == "UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_LOCAL", data_rd_loc_occ
+
+ loc_lat = interval_sec * 1e9 * data_rd_loc_occ / (ticks * data_rd_loc_ins)
+ rem_lat = interval_sec * 1e9 * data_rd_rem_occ / (ticks * data_rd_rem_ins)
+ return MetricGroup("miss_lat", [
+ Metric("miss_lat_loc", "Local to a socket miss latency in nanoseconds",
+ loc_lat, "ns"),
+ Metric("miss_lat_rem", "Remote to a socket miss latency in nanoseconds",
+ rem_lat, "ns"),
+ ])
+
+
def IntelMlp() -> Optional[Metric]:
try:
l1d = Event("L1D_PEND_MISS.PENDING")
@@ -960,6 +1018,7 @@ all_metrics = MetricGroup("", [
IntelIlp(),
IntelL2(),
IntelLdSt(),
+ IntelMissLat(),
IntelMlp(),
IntelPorts(),
IntelSwpf(),
--
2.44.0.278.ge034bb2e1d-goog


2024-02-29 21:13:32

by Liang, Kan

[permalink] [raw]
Subject: Re: [PATCH v1 03/20] perf jevents: Add smi metric group for Intel models



On 2024-02-28 7:17 p.m., Ian Rogers wrote:
> Allow duplicated metric to be dropped from json files.
>
> Signed-off-by: Ian Rogers <[email protected]>
> ---
> tools/perf/pmu-events/intel_metrics.py | 18 +++++++++++++++++-
> 1 file changed, 17 insertions(+), 1 deletion(-)
>
> diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
> index 46866a25b166..20c25d142f24 100755
> --- a/tools/perf/pmu-events/intel_metrics.py
> +++ b/tools/perf/pmu-events/intel_metrics.py
> @@ -2,7 +2,7 @@
> # SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
> from metric import (d_ratio, has_event, max, Event, JsonEncodeMetric,
> JsonEncodeMetricGroupDescriptions, LoadEvents, Metric,
> - MetricGroup, Select)
> + MetricGroup, MetricRef, Select)
> import argparse
> import json
> import math
> @@ -62,9 +62,25 @@ def Rapl() -> MetricGroup:
> description="Processor socket power consumption estimates")
>
>
> +def Smi() -> MetricGroup:
> + aperf = Event('msr/aperf/')

There is CPUID enumeration for aperf and mperf. I believe they should
always be available on newer bare metal, but they may not be enumerated
in a virtualization environment. Should we add a has_event() check
before using them?

Thanks,
Kan

> + cycles = Event('cycles')
> + smi_num = Event('msr/smi/')
> + smi_cycles = Select((aperf - cycles) / aperf, smi_num > 0, 0)
> + return MetricGroup('smi', [
> + Metric('smi_num', 'Number of SMI interrupts.',
> + smi_num, 'SMI#'),
> + # Note, the smi_cycles "Event" is really a reference to the metric.
> + Metric('smi_cycles',
> + 'Percentage of cycles spent in System Management Interrupts.',
> + smi_cycles, '100%', threshold=(MetricRef('smi_cycles') > 0.10))
> + ])
> +
> +
> all_metrics = MetricGroup("", [
> Idle(),
> Rapl(),
> + Smi(),
> ])
>
> if args.metricgroups:

2024-02-29 21:16:09

by Liang, Kan

[permalink] [raw]
Subject: Re: [PATCH v1 04/20] perf jevents: Add tsx metric group for Intel models



On 2024-02-28 7:17 p.m., Ian Rogers wrote:
> Allow duplicated metric to be dropped from json files.
>
> Signed-off-by: Ian Rogers <[email protected]>
> ---
> tools/perf/pmu-events/intel_metrics.py | 51 ++++++++++++++++++++++++++
> 1 file changed, 51 insertions(+)
>
> diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
> index 20c25d142f24..1096accea2aa 100755
> --- a/tools/perf/pmu-events/intel_metrics.py
> +++ b/tools/perf/pmu-events/intel_metrics.py
> @@ -7,6 +7,7 @@ import argparse
> import json
> import math
> import os
> +from typing import Optional
>
> parser = argparse.ArgumentParser(description="Intel perf json generator")
> parser.add_argument("-metricgroups", help="Generate metricgroups data", action='store_true')
> @@ -77,10 +78,60 @@ def Smi() -> MetricGroup:
> ])
>
>
> +def Tsx() -> Optional[MetricGroup]:
> + if args.model not in [
> + 'alderlake',
> + 'cascadelakex',
> + 'icelake',
> + 'icelakex',
> + 'rocketlake',
> + 'sapphirerapids',
> + 'skylake',
> + 'skylakex',
> + 'tigerlake',
> + ]:

Can we get rid of the model list? Otherwise, we have to keep updating
the list.

> + return None
> +
> + pmu = "cpu_core" if args.model == "alderlake" else "cpu"

Is it possible to change the check to the existence of the "cpu" PMU
here? has_pmu("cpu") ? "cpu" : "cpu_core"

> + cycles = Event('cycles')
> + cycles_in_tx = Event(f'{pmu}/cycles\-t/')
> + transaction_start = Event(f'{pmu}/tx\-start/')
> + cycles_in_tx_cp = Event(f'{pmu}/cycles\-ct/')
> + metrics = [
> + Metric('tsx_transactional_cycles',
> + 'Percentage of cycles within a transaction region.',
> + Select(cycles_in_tx / cycles, has_event(cycles_in_tx), 0),
> + '100%'),
> + Metric('tsx_aborted_cycles', 'Percentage of cycles in aborted transactions.',
> + Select(max(cycles_in_tx - cycles_in_tx_cp, 0) / cycles,
> + has_event(cycles_in_tx),
> + 0),
> + '100%'),
> + Metric('tsx_cycles_per_transaction',
> + 'Number of cycles within a transaction divided by the number of transactions.',
> + Select(cycles_in_tx / transaction_start,
> + has_event(cycles_in_tx),
> + 0),
> + "cycles / transaction"),
> + ]
> + if args.model != 'sapphirerapids':

Add the "tsx_cycles_per_elision" metric only if
has_event(f'{pmu}/el\-start/')?

Thanks,
Kan

> + elision_start = Event(f'{pmu}/el\-start/')
> + metrics += [
> + Metric('tsx_cycles_per_elision',
> + 'Number of cycles within a transaction divided by the number of elisions.',
> + Select(cycles_in_tx / elision_start,
> + has_event(elision_start),
> + 0),
> + "cycles / elision"),
> + ]
> + return MetricGroup('transaction', metrics)
> +
> +
> all_metrics = MetricGroup("", [
> Idle(),
> Rapl(),
> Smi(),
> + Tsx(),
> ])
>
> if args.metricgroups:

2024-02-29 22:24:45

by Liang, Kan

[permalink] [raw]
Subject: Re: [PATCH v1 05/20] perf jevents: Add br metric group for branch statistics on Intel



On 2024-02-28 7:17 p.m., Ian Rogers wrote:
> The br metric group for branches itself comprises metric groups for
> total, taken, conditional, fused and far metric groups using json
> events. Condtional taken and not taken metrics are specific to Icelake
> and later generations, so a model to generation look up is added.
>
> Signed-off-by: Ian Rogers <[email protected]>
> ---
> tools/perf/pmu-events/intel_metrics.py | 139 +++++++++++++++++++++++++
> 1 file changed, 139 insertions(+)
>
> diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
> index 1096accea2aa..bee5da19d19d 100755
> --- a/tools/perf/pmu-events/intel_metrics.py
> +++ b/tools/perf/pmu-events/intel_metrics.py
> @@ -19,6 +19,7 @@ LoadEvents(directory)
>
> interval_sec = Event("duration_time")
>
> +

Unnecessary empty line.

Thanks,
Kan

> def Idle() -> Metric:
> cyc = Event("msr/mperf/")
> tsc = Event("msr/tsc/")
> @@ -127,11 +128,149 @@ def Tsx() -> Optional[MetricGroup]:
> return MetricGroup('transaction', metrics)
>
>
> +def IntelBr():
> + ins = Event("instructions")
> +
> + def Total() -> MetricGroup:
> + br_all = Event("BR_INST_RETIRED.ALL_BRANCHES", "BR_INST_RETIRED.ANY")
> + br_m_all = Event("BR_MISP_RETIRED.ALL_BRANCHES",
> + "BR_INST_RETIRED.MISPRED",
> + "BR_MISP_EXEC.ANY")
> + br_clr = None
> + try:
> + br_clr = Event("BACLEARS.ANY", "BACLEARS.ALL")
> + except:
> + pass
> +
> + br_r = d_ratio(br_all, interval_sec)
> + ins_r = d_ratio(ins, br_all)
> + misp_r = d_ratio(br_m_all, br_all)
> + clr_r = d_ratio(br_clr, interval_sec) if br_clr else None
> +
> + return MetricGroup("br_total", [
> + Metric("br_total_retired",
> + "The number of branch instructions retired per second.", br_r,
> + "insn/s"),
> + Metric(
> + "br_total_mispred",
> + "The number of branch instructions retired, of any type, that were "
> + "not correctly predicted as a percentage of all branch instrucions.",
> + misp_r, "100%"),
> + Metric("br_total_insn_between_branches",
> + "The number of instructions divided by the number of branches.",
> + ins_r, "insn"),
> + Metric("br_total_insn_fe_resteers",
> + "The number of resync branches per second.", clr_r, "req/s"
> + ) if clr_r else None
> + ])
> +
> + def Taken() -> MetricGroup:
> + br_all = Event("BR_INST_RETIRED.ALL_BRANCHES", "BR_INST_RETIRED.ANY")
> + br_m_tk = None
> + try:
> + br_m_tk = Event("BR_MISP_RETIRED.NEAR_TAKEN",
> + "BR_MISP_RETIRED.TAKEN_JCC",
> + "BR_INST_RETIRED.MISPRED_TAKEN")
> + except:
> + pass
> + br_r = d_ratio(br_all, interval_sec)
> + ins_r = d_ratio(ins, br_all)
> + misp_r = d_ratio(br_m_tk, br_all) if br_m_tk else None
> + return MetricGroup("br_taken", [
> + Metric("br_taken_retired",
> + "The number of taken branches that were retired per second.",
> + br_r, "insn/s"),
> + Metric(
> + "br_taken_mispred",
> + "The number of retired taken branch instructions that were "
> + "mispredicted as a percentage of all taken branches.", misp_r,
> + "100%") if misp_r else None,
> + Metric(
> + "br_taken_insn_between_branches",
> + "The number of instructions divided by the number of taken branches.",
> + ins_r, "insn"),
> + ])
> +
> + def Conditional() -> Optional[MetricGroup]:
> + try:
> + br_cond = Event("BR_INST_RETIRED.COND",
> + "BR_INST_RETIRED.CONDITIONAL",
> + "BR_INST_RETIRED.TAKEN_JCC")
> + br_m_cond = Event("BR_MISP_RETIRED.COND",
> + "BR_MISP_RETIRED.CONDITIONAL",
> + "BR_MISP_RETIRED.TAKEN_JCC")
> + except:
> + return None
> +
> + br_cond_nt = None
> + br_m_cond_nt = None
> + try:
> + br_cond_nt = Event("BR_INST_RETIRED.COND_NTAKEN")
> + br_m_cond_nt = Event("BR_MISP_RETIRED.COND_NTAKEN")
> + except:
> + pass
> + br_r = d_ratio(br_cond, interval_sec)
> + ins_r = d_ratio(ins, br_cond)
> + misp_r = d_ratio(br_m_cond, br_cond)
> + taken_metrics = [
> + Metric("br_cond_retired", "Retired conditional branch instructions.",
> + br_r, "insn/s"),
> + Metric("br_cond_insn_between_branches",
> + "The number of instructions divided by the number of conditional "
> + "branches.", ins_r, "insn"),
> + Metric("br_cond_mispred",
> + "Retired conditional branch instructions mispredicted as a "
> + "percentage of all conditional branches.", misp_r, "100%"),
> + ]
> + if not br_m_cond_nt:
> + return MetricGroup("br_cond", taken_metrics)
> +
> + br_r = d_ratio(br_cond_nt, interval_sec)
> + ins_r = d_ratio(ins, br_cond_nt)
> + misp_r = d_ratio(br_m_cond_nt, br_cond_nt)
> +
> + not_taken_metrics = [
> + Metric("br_cond_retired", "Retired conditional not taken branch instructions.",
> + br_r, "insn/s"),
> + Metric("br_cond_insn_between_branches",
> + "The number of instructions divided by the number of not taken conditional "
> + "branches.", ins_r, "insn"),
> + Metric("br_cond_mispred",
> + "Retired not taken conditional branch instructions mispredicted as a "
> + "percentage of all not taken conditional branches.", misp_r, "100%"),
> + ]
> + return MetricGroup("br_cond", [
> + MetricGroup("br_cond_nt", not_taken_metrics),
> + MetricGroup("br_cond_tkn", taken_metrics),
> + ])
> +
> + def Far() -> Optional[MetricGroup]:
> + try:
> + br_far = Event("BR_INST_RETIRED.FAR_BRANCH")
> + except:
> + return None
> +
> + br_r = d_ratio(br_far, interval_sec)
> + ins_r = d_ratio(ins, br_far)
> + return MetricGroup("br_far", [
> + Metric("br_far_retired", "Retired far control transfers per second.",
> + br_r, "insn/s"),
> + Metric(
> + "br_far_insn_between_branches",
> + "The number of instructions divided by the number of far branches.",
> + ins_r, "insn"),
> + ])
> +
> + return MetricGroup("br", [Total(), Taken(), Conditional(), Far()],
> + description="breakdown of retired branch instructions")
> +
> +
> all_metrics = MetricGroup("", [
> Idle(),
> Rapl(),
> Smi(),
> Tsx(),
> + IntelBr(),
> ])
>
> if args.metricgroups:

2024-03-01 00:35:19

by Liang, Kan

[permalink] [raw]
Subject: Re: [PATCH v1 13/20] perf jevents: Add cycles breakdown metric for Intel



On 2024-02-28 7:17 p.m., Ian Rogers wrote:
> Breakdown cycles to user, kernel and guest.
>
> Signed-off-by: Ian Rogers <[email protected]>
> ---
> tools/perf/pmu-events/intel_metrics.py | 18 ++++++++++++++++++
> 1 file changed, 18 insertions(+)
>
> diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
> index dae44d296861..fef40969a4b8 100755
> --- a/tools/perf/pmu-events/intel_metrics.py
> +++ b/tools/perf/pmu-events/intel_metrics.py
> @@ -26,6 +26,23 @@ core_cycles = Event("CPU_CLK_UNHALTED.THREAD_P_ANY",
> smt_cycles = Select(core_cycles / 2, Literal("#smt_on"), core_cycles)
>
>
> +def Cycles() -> MetricGroup:
> + cyc_k = Event("cycles:kHh")
> + cyc_g = Event("cycles:G")
> + cyc_u = Event("cycles:uH")
> + cyc = cyc_k + cyc_g + cyc_u
> +
> + return MetricGroup("cycles", [
> + Metric("cycles_total", "Total number of cycles", cyc, "cycles"),
> + Metric("cycles_user", "User cycles as a percentage of all cycles",
> + d_ratio(cyc_u, cyc), "100%"),
> + Metric("cycles_kernel", "Kernel cycles as a percentage of all cycles",
> + d_ratio(cyc_k, cyc), "100%"),
> + Metric("cycles_guest", "Hypervisor guest cycles as a percentage of all cycles",
> + d_ratio(cyc_g, cyc), "100%"),
> > + ], description = "cycles breakdown per privilege level (user, kernel, guest)")
> +
> +
> def Idle() -> Metric:
> cyc = Event("msr/mperf/")
> tsc = Event("msr/tsc/")
> @@ -770,6 +787,7 @@ def IntelLdSt() -> Optional[MetricGroup]:
>
>
> all_metrics = MetricGroup("", [
> + Cycles(),

The metric group seems exactly the same on AMD and ARM. Maybe we can have
tools/perf/pmu-events/common_metrics.py for all the common metrics.

Thanks,
Kan

> Idle(),
> Rapl(),
> Smi(),

2024-03-01 00:48:38

by Ian Rogers

[permalink] [raw]
Subject: Re: [PATCH v1 13/20] perf jevents: Add cycles breakdown metric for Intel

On Thu, Feb 29, 2024 at 1:30 PM Liang, Kan <[email protected]> wrote:
>
>
>
> On 2024-02-28 7:17 p.m., Ian Rogers wrote:
> > Breakdown cycles to user, kernel and guest.
> >
> > Signed-off-by: Ian Rogers <[email protected]>
> > ---
> > tools/perf/pmu-events/intel_metrics.py | 18 ++++++++++++++++++
> > 1 file changed, 18 insertions(+)
> >
> > diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
> > index dae44d296861..fef40969a4b8 100755
> > --- a/tools/perf/pmu-events/intel_metrics.py
> > +++ b/tools/perf/pmu-events/intel_metrics.py
> > @@ -26,6 +26,23 @@ core_cycles = Event("CPU_CLK_UNHALTED.THREAD_P_ANY",
> > smt_cycles = Select(core_cycles / 2, Literal("#smt_on"), core_cycles)
> >
> >
> > +def Cycles() -> MetricGroup:
> > + cyc_k = Event("cycles:kHh")
> > + cyc_g = Event("cycles:G")
> > + cyc_u = Event("cycles:uH")
> > + cyc = cyc_k + cyc_g + cyc_u
> > +
> > + return MetricGroup("cycles", [
> > + Metric("cycles_total", "Total number of cycles", cyc, "cycles"),
> > + Metric("cycles_user", "User cycles as a percentage of all cycles",
> > + d_ratio(cyc_u, cyc), "100%"),
> > + Metric("cycles_kernel", "Kernel cycles as a percentage of all cycles",
> > + d_ratio(cyc_k, cyc), "100%"),
> > + Metric("cycles_guest", "Hypervisor guest cycles as a percentage of all cycles",
> > + d_ratio(cyc_g, cyc), "100%"),
> > + ], description = "cycles breakdown per privilege level (user, kernel, guest)")
> > +
> > +
> > def Idle() -> Metric:
> > cyc = Event("msr/mperf/")
> > tsc = Event("msr/tsc/")
> > @@ -770,6 +787,7 @@ def IntelLdSt() -> Optional[MetricGroup]:
> >
> >
> > all_metrics = MetricGroup("", [
> > + Cycles(),
>
> The metric group seems exactly the same on AMD and ARM. Maybe we can have
> tools/perf/pmu-events/common_metrics.py for all the common metrics.

Agreed. I think we can drop cycles in the three sets and then
do the common_metrics.py as a follow up.
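
For illustration, here is a rough, self-contained sketch of what a shared
Cycles() in a common_metrics.py might look like. The Event/Metric/d_ratio
stand-ins below are simplified string-building stubs, not the real helpers
from tools/perf/pmu-events/metric.py:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Event:
    """Stand-in for metric.py's Event; real Events build perf expressions."""
    name: str

    def __add__(self, other: "Event") -> "Event":
        return Event(f"{self.name} + {other.name}")


@dataclass
class Metric:
    """Stand-in carrying just the fields used below."""
    name: str
    description: str
    expr: str
    unit: str


def d_ratio(num: Event, den: Event) -> str:
    # metric.py's d_ratio guards against division by zero.
    return f"{num.name} / ({den.name} if {den.name} else 1)"


def Cycles() -> List[Metric]:
    cyc_k = Event("cycles:kHh")
    cyc_g = Event("cycles:G")
    cyc_u = Event("cycles:uH")
    cyc = cyc_k + cyc_g + cyc_u
    return [
        Metric("cycles_total", "Total number of cycles", cyc.name, "cycles"),
        Metric("cycles_user", "User cycles as a percentage of all cycles",
               d_ratio(cyc_u, cyc), "100%"),
        Metric("cycles_kernel", "Kernel cycles as a percentage of all cycles",
               d_ratio(cyc_k, cyc), "100%"),
        Metric("cycles_guest", "Hypervisor guest cycles as a percentage of all cycles",
               d_ratio(cyc_g, cyc), "100%"),
    ]


if __name__ == "__main__":
    for m in Cycles():
        print(f"{m.name}: {m.expr} ({m.unit})")
```

The intel/amd/arm generators would then just import Cycles() and append it
to their all_metrics group.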

Thanks,
Ian

> Thanks,
> Kan
>
> > Idle(),
> > Rapl(),
> > Smi(),

2024-03-01 00:54:28

by Ian Rogers

[permalink] [raw]
Subject: Re: [PATCH v1 03/20] perf jevents: Add smi metric group for Intel models

On Thu, Feb 29, 2024 at 1:09 PM Liang, Kan <[email protected]> wrote:
>
>
>
> On 2024-02-28 7:17 p.m., Ian Rogers wrote:
> > Allow duplicated metric to be dropped from json files.
> >
> > Signed-off-by: Ian Rogers <[email protected]>
> > ---
> > tools/perf/pmu-events/intel_metrics.py | 18 +++++++++++++++++-
> > 1 file changed, 17 insertions(+), 1 deletion(-)
> >
> > diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
> > index 46866a25b166..20c25d142f24 100755
> > --- a/tools/perf/pmu-events/intel_metrics.py
> > +++ b/tools/perf/pmu-events/intel_metrics.py
> > @@ -2,7 +2,7 @@
> > # SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
> > from metric import (d_ratio, has_event, max, Event, JsonEncodeMetric,
> > JsonEncodeMetricGroupDescriptions, LoadEvents, Metric,
> > - MetricGroup, Select)
> > + MetricGroup, MetricRef, Select)
> > import argparse
> > import json
> > import math
> > @@ -62,9 +62,25 @@ def Rapl() -> MetricGroup:
> > description="Processor socket power consumption estimates")
> >
> >
> > +def Smi() -> MetricGroup:
> > + aperf = Event('msr/aperf/')
>
> There is CPUID enumeration for aperf and mperf. I believe they
> should always be available on newer bare metal, but they may not be
> enumerated in a virtualization env. Should we add a has_event() check
> before using it?

It would make sense to have the has_event so that the metric doesn't
fail in perf test. I'll add it.
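
Roughly the guarded form I have in mind (string-building stand-ins for
Select/has_event so the snippet runs standalone; the real helpers live in
tools/perf/pmu-events/metric.py):

```python
# Sketch: wrap the aperf-based smi_cycles in has_event() so the metric
# degrades to 0 when the msr PMU isn't enumerated (e.g. under
# virtualization). Stand-in helpers mimic metric.py's expression builders.
def has_event(event: str) -> str:
    return f"has_event({event})"


def Select(true_val: str, cond: str, false_val: str) -> str:
    return f"({true_val} if {cond} else {false_val})"


aperf = "msr/aperf/"
cycles = "cycles"
smi_num = "msr/smi/"

smi_cycles = Select(
    Select(f"({aperf} - {cycles}) / {aperf}", f"{smi_num} > 0", "0"),
    has_event(aperf),  # only use aperf when the msr PMU exposes it
    "0")

print(smi_cycles)
```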

Thanks,
Ian

> Thanks,
> Kan
>
> > + cycles = Event('cycles')
> > + smi_num = Event('msr/smi/')
> > + smi_cycles = Select((aperf - cycles) / aperf, smi_num > 0, 0)
> > + return MetricGroup('smi', [
> > + Metric('smi_num', 'Number of SMI interrupts.',
> > + smi_num, 'SMI#'),
> > + # Note, the smi_cycles "Event" is really a reference to the metric.
> > + Metric('smi_cycles',
> > + 'Percentage of cycles spent in System Management Interrupts.',
> > + smi_cycles, '100%', threshold=(MetricRef('smi_cycles') > 0.10))
> > + ])
> > +
> > +
> > all_metrics = MetricGroup("", [
> > Idle(),
> > Rapl(),
> > + Smi(),
> > ])
> >
> > if args.metricgroups:

2024-03-01 01:01:56

by Ian Rogers

[permalink] [raw]
Subject: Re: [PATCH v1 04/20] perf jevents: Add tsx metric group for Intel models

On Thu, Feb 29, 2024 at 1:15 PM Liang, Kan <[email protected]> wrote:
>
>
>
> On 2024-02-28 7:17 p.m., Ian Rogers wrote:
> > Allow duplicated metric to be dropped from json files.
> >
> > Signed-off-by: Ian Rogers <[email protected]>
> > ---
> > tools/perf/pmu-events/intel_metrics.py | 51 ++++++++++++++++++++++++++
> > 1 file changed, 51 insertions(+)
> >
> > diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
> > index 20c25d142f24..1096accea2aa 100755
> > --- a/tools/perf/pmu-events/intel_metrics.py
> > +++ b/tools/perf/pmu-events/intel_metrics.py
> > @@ -7,6 +7,7 @@ import argparse
> > import json
> > import math
> > import os
> > +from typing import Optional
> >
> > parser = argparse.ArgumentParser(description="Intel perf json generator")
> > parser.add_argument("-metricgroups", help="Generate metricgroups data", action='store_true')
> > @@ -77,10 +78,60 @@ def Smi() -> MetricGroup:
> > ])
> >
> >
> > +def Tsx() -> Optional[MetricGroup]:
> > + if args.model not in [
> > + 'alderlake',
> > + 'cascadelakex',
> > + 'icelake',
> > + 'icelakex',
> > + 'rocketlake',
> > + 'sapphirerapids',
> > + 'skylake',
> > + 'skylakex',
> > + 'tigerlake',
> > + ]:
>
> Can we get rid of the model list? Otherwise, we have to keep updating
> the list.

Do we expect the list to update? :-) The issue is the events are in
sysfs and not the json. If we added the tsx events to json then this
list wouldn't be necessary, but it also would mean the events would be
present in "perf list" even when TSX is disabled.

> > + return None
> > +
> > + pmu = "cpu_core" if args.model == "alderlake" else "cpu"
>
> Is it possible to change the check to the existence of the "cpu" PMU
> here? has_pmu("cpu") ? "cpu" : "cpu_core"

The "Unit" on "cpu" events in json is always just blank. On hybrid it is
either "cpu_core" or "cpu_atom", so I can make this something like:

pmu = "cpu_core" if metrics.HasPmu("cpu_core") else "cpu"

which would be a build time test.
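
A sketch of how that build-time check could work, assuming the generator can
see the model's event json; the directory layout and the check_pmu/pick_pmu
names below are illustrative, not the real API:

```python
# Sketch: decide between "cpu" and "cpu_core" at jevents build time by
# scanning the model's event json for a matching "Unit" field (hybrid
# models tag events with cpu_core/cpu_atom; non-hybrid leave it blank).
import glob
import json


def check_pmu(events_dir: str, pmu: str) -> bool:
    for path in glob.glob(f"{events_dir}/*.json"):
        with open(path) as f:
            events = json.load(f)
        for ev in events:
            if isinstance(ev, dict) and ev.get("Unit") == pmu:
                return True
    return False


def pick_pmu(events_dir: str) -> str:
    return "cpu_core" if check_pmu(events_dir, "cpu_core") else "cpu"
```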


> > + cycles = Event('cycles')
> > + cycles_in_tx = Event(f'{pmu}/cycles\-t/')
> > + transaction_start = Event(f'{pmu}/tx\-start/')
> > + cycles_in_tx_cp = Event(f'{pmu}/cycles\-ct/')
> > + metrics = [
> > + Metric('tsx_transactional_cycles',
> > + 'Percentage of cycles within a transaction region.',
> > + Select(cycles_in_tx / cycles, has_event(cycles_in_tx), 0),
> > + '100%'),
> > + Metric('tsx_aborted_cycles', 'Percentage of cycles in aborted transactions.',
> > + Select(max(cycles_in_tx - cycles_in_tx_cp, 0) / cycles,
> > + has_event(cycles_in_tx),
> > + 0),
> > + '100%'),
> > + Metric('tsx_cycles_per_transaction',
> > + 'Number of cycles within a transaction divided by the number of transactions.',
> > + Select(cycles_in_tx / transaction_start,
> > + has_event(cycles_in_tx),
> > + 0),
> > + "cycles / transaction"),
> > + ]
> > + if args.model != 'sapphirerapids':
>
> Add the "tsx_cycles_per_elision" metric only if
> has_event(f'{pmu}/el\-start/')?

It's a sysfs event, so this wouldn't work :-(

Thanks,
Ian

> Thanks,
> Kan
>
> > + elision_start = Event(f'{pmu}/el\-start/')
> > + metrics += [
> > + Metric('tsx_cycles_per_elision',
> > + 'Number of cycles within a transaction divided by the number of elisions.',
> > + Select(cycles_in_tx / elision_start,
> > + has_event(elision_start),
> > + 0),
> > + "cycles / elision"),
> > + ]
> > + return MetricGroup('transaction', metrics)
> > +
> > +
> > all_metrics = MetricGroup("", [
> > Idle(),
> > Rapl(),
> > Smi(),
> > + Tsx(),
> > ])
> >
> > if args.metricgroups:

2024-03-01 01:02:47

by Ian Rogers

[permalink] [raw]
Subject: Re: [PATCH v1 05/20] perf jevents: Add br metric group for branch statistics on Intel

On Thu, Feb 29, 2024 at 1:17 PM Liang, Kan <[email protected]> wrote:
>
>
>
> On 2024-02-28 7:17 p.m., Ian Rogers wrote:
> > The br metric group for branches itself comprises metric groups for
> > total, taken, conditional, fused and far metric groups using json
> > events. Conditional taken and not taken metrics are specific to Icelake
> > and later generations, so a model to generation look up is added.
> >
> > Signed-off-by: Ian Rogers <[email protected]>
> > ---
> > tools/perf/pmu-events/intel_metrics.py | 139 +++++++++++++++++++++++++
> > 1 file changed, 139 insertions(+)
> >
> > diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
> > index 1096accea2aa..bee5da19d19d 100755
> > --- a/tools/perf/pmu-events/intel_metrics.py
> > +++ b/tools/perf/pmu-events/intel_metrics.py
> > @@ -19,6 +19,7 @@ LoadEvents(directory)
> >
> > interval_sec = Event("duration_time")
> >
> > +
>
> Unnecessary empty line.

Ack. Will fix in v2.

Thanks,
Ian

> Thanks,
> Kan
>
> > def Idle() -> Metric:
> > cyc = Event("msr/mperf/")
> > tsc = Event("msr/tsc/")
> > @@ -127,11 +128,149 @@ def Tsx() -> Optional[MetricGroup]:
> > return MetricGroup('transaction', metrics)
> >
> >
> > +def IntelBr():
> > + ins = Event("instructions")
> > +
> > + def Total() -> MetricGroup:
> > + br_all = Event("BR_INST_RETIRED.ALL_BRANCHES", "BR_INST_RETIRED.ANY")
> > + br_m_all = Event("BR_MISP_RETIRED.ALL_BRANCHES",
> > + "BR_INST_RETIRED.MISPRED",
> > + "BR_MISP_EXEC.ANY")
> > + br_clr = None
> > + try:
> > + br_clr = Event("BACLEARS.ANY", "BACLEARS.ALL")
> > + except:
> > + pass
> > +
> > + br_r = d_ratio(br_all, interval_sec)
> > + ins_r = d_ratio(ins, br_all)
> > + misp_r = d_ratio(br_m_all, br_all)
> > + clr_r = d_ratio(br_clr, interval_sec) if br_clr else None
> > +
> > + return MetricGroup("br_total", [
> > + Metric("br_total_retired",
> > + "The number of branch instructions retired per second.", br_r,
> > + "insn/s"),
> > + Metric(
> > + "br_total_mispred",
> > + "The number of branch instructions retired, of any type, that were "
> > + "not correctly predicted as a percentage of all branch instructions.",
> > + misp_r, "100%"),
> > + Metric("br_total_insn_between_branches",
> > + "The number of instructions divided by the number of branches.",
> > + ins_r, "insn"),
> > + Metric("br_total_insn_fe_resteers",
> > + "The number of resync branches per second.", clr_r, "req/s"
> > + ) if clr_r else None
> > + ])
> > +
> > + def Taken() -> MetricGroup:
> > + br_all = Event("BR_INST_RETIRED.ALL_BRANCHES", "BR_INST_RETIRED.ANY")
> > + br_m_tk = None
> > + try:
> > + br_m_tk = Event("BR_MISP_RETIRED.NEAR_TAKEN",
> > + "BR_MISP_RETIRED.TAKEN_JCC",
> > + "BR_INST_RETIRED.MISPRED_TAKEN")
> > + except:
> > + pass
> > + br_r = d_ratio(br_all, interval_sec)
> > + ins_r = d_ratio(ins, br_all)
> > + misp_r = d_ratio(br_m_tk, br_all) if br_m_tk else None
> > + return MetricGroup("br_taken", [
> > + Metric("br_taken_retired",
> > + "The number of taken branches that were retired per second.",
> > + br_r, "insn/s"),
> > + Metric(
> > + "br_taken_mispred",
> > + "The number of retired taken branch instructions that were "
> > + "mispredicted as a percentage of all taken branches.", misp_r,
> > + "100%") if misp_r else None,
> > + Metric(
> > + "br_taken_insn_between_branches",
> > + "The number of instructions divided by the number of taken branches.",
> > + ins_r, "insn"),
> > + ])
> > +
> > + def Conditional() -> Optional[MetricGroup]:
> > + try:
> > + br_cond = Event("BR_INST_RETIRED.COND",
> > + "BR_INST_RETIRED.CONDITIONAL",
> > + "BR_INST_RETIRED.TAKEN_JCC")
> > + br_m_cond = Event("BR_MISP_RETIRED.COND",
> > + "BR_MISP_RETIRED.CONDITIONAL",
> > + "BR_MISP_RETIRED.TAKEN_JCC")
> > + except:
> > + return None
> > +
> > + br_cond_nt = None
> > + br_m_cond_nt = None
> > + try:
> > + br_cond_nt = Event("BR_INST_RETIRED.COND_NTAKEN")
> > + br_m_cond_nt = Event("BR_MISP_RETIRED.COND_NTAKEN")
> > + except:
> > + pass
> > + br_r = d_ratio(br_cond, interval_sec)
> > + ins_r = d_ratio(ins, br_cond)
> > + misp_r = d_ratio(br_m_cond, br_cond)
> > + taken_metrics = [
> > + Metric("br_cond_retired", "Retired conditional branch instructions.",
> > + br_r, "insn/s"),
> > + Metric("br_cond_insn_between_branches",
> > + "The number of instructions divided by the number of conditional "
> > + "branches.", ins_r, "insn"),
> > + Metric("br_cond_mispred",
> > + "Retired conditional branch instructions mispredicted as a "
> > + "percentage of all conditional branches.", misp_r, "100%"),
> > + ]
> > + if not br_m_cond_nt:
> > + return MetricGroup("br_cond", taken_metrics)
> > +
> > + br_r = d_ratio(br_cond_nt, interval_sec)
> > + ins_r = d_ratio(ins, br_cond_nt)
> > + misp_r = d_ratio(br_m_cond_nt, br_cond_nt)
> > +
> > + not_taken_metrics = [
> > + Metric("br_cond_retired", "Retired conditional not taken branch instructions.",
> > + br_r, "insn/s"),
> > + Metric("br_cond_insn_between_branches",
> > + "The number of instructions divided by the number of not taken conditional "
> > + "branches.", ins_r, "insn"),
> > + Metric("br_cond_mispred",
> > + "Retired not taken conditional branch instructions mispredicted as a "
> > + "percentage of all not taken conditional branches.", misp_r, "100%"),
> > + ]
> > + return MetricGroup("br_cond", [
> > + MetricGroup("br_cond_nt", not_taken_metrics),
> > + MetricGroup("br_cond_tkn", taken_metrics),
> > + ])
> > +
> > + def Far() -> Optional[MetricGroup]:
> > + try:
> > + br_far = Event("BR_INST_RETIRED.FAR_BRANCH")
> > + except:
> > + return None
> > +
> > + br_r = d_ratio(br_far, interval_sec)
> > + ins_r = d_ratio(ins, br_far)
> > + return MetricGroup("br_far", [
> > + Metric("br_far_retired", "Retired far control transfers per second.",
> > + br_r, "insn/s"),
> > + Metric(
> > + "br_far_insn_between_branches",
> > + "The number of instructions divided by the number of far branches.",
> > + ins_r, "insn"),
> > + ])
> > +
> > + return MetricGroup("br", [Total(), Taken(), Conditional(), Far()],
> > + description="breakdown of retired branch instructions")
> > +
> > +
> > all_metrics = MetricGroup("", [
> > Idle(),
> > Rapl(),
> > Smi(),
> > Tsx(),
> > + IntelBr(),
> > ])
> >
> > if args.metricgroups:

2024-03-01 13:53:56

by Liang, Kan

[permalink] [raw]
Subject: Re: [PATCH v1 13/20] perf jevents: Add cycles breakdown metric for Intel



On 2024-02-29 7:48 p.m., Ian Rogers wrote:
> On Thu, Feb 29, 2024 at 1:30 PM Liang, Kan <[email protected]> wrote:
>>
>>
>>
>> On 2024-02-28 7:17 p.m., Ian Rogers wrote:
>>> Breakdown cycles to user, kernel and guest.
>>>
>>> Signed-off-by: Ian Rogers <[email protected]>
>>> ---
>>> tools/perf/pmu-events/intel_metrics.py | 18 ++++++++++++++++++
>>> 1 file changed, 18 insertions(+)
>>>
>>> diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
>>> index dae44d296861..fef40969a4b8 100755
>>> --- a/tools/perf/pmu-events/intel_metrics.py
>>> +++ b/tools/perf/pmu-events/intel_metrics.py
>>> @@ -26,6 +26,23 @@ core_cycles = Event("CPU_CLK_UNHALTED.THREAD_P_ANY",
>>> smt_cycles = Select(core_cycles / 2, Literal("#smt_on"), core_cycles)
>>>
>>>
>>> +def Cycles() -> MetricGroup:
>>> + cyc_k = Event("cycles:kHh")
>>> + cyc_g = Event("cycles:G")
>>> + cyc_u = Event("cycles:uH")
>>> + cyc = cyc_k + cyc_g + cyc_u
>>> +
>>> + return MetricGroup("cycles", [
>>> + Metric("cycles_total", "Total number of cycles", cyc, "cycles"),
>>> + Metric("cycles_user", "User cycles as a percentage of all cycles",
>>> + d_ratio(cyc_u, cyc), "100%"),
>>> + Metric("cycles_kernel", "Kernel cycles as a percentage of all cycles",
>>> + d_ratio(cyc_k, cyc), "100%"),
>>> + Metric("cycles_guest", "Hypervisor guest cycles as a percentage of all cycles",
>>> + d_ratio(cyc_g, cyc), "100%"),
>>> + ], description = "cycles breakdown per privilege level (user, kernel, guest)")
>>> +
>>> +
>>> def Idle() -> Metric:
>>> cyc = Event("msr/mperf/")
>>> tsc = Event("msr/tsc/")
>>> @@ -770,6 +787,7 @@ def IntelLdSt() -> Optional[MetricGroup]:
>>>
>>>
>>> all_metrics = MetricGroup("", [
>>> + Cycles(),
>>
>> The metric group seems exactly the same on AMD and ARM. Maybe we can have
>> tools/perf/pmu-events/common_metrics.py for all the common metrics.
>
> Agreed. I think we can drop cycles in the three sets and then
> do the common_metrics.py as a follow up.
>

Sounds good to me.

Thanks,
Kan

> Thanks,
> Ian
>
>> Thanks,
>> Kan
>>
>>> Idle(),
>>> Rapl(),
>>> Smi(),
>

2024-03-01 14:52:43

by Liang, Kan

[permalink] [raw]
Subject: Re: [PATCH v1 04/20] perf jevents: Add tsx metric group for Intel models



On 2024-02-29 8:01 p.m., Ian Rogers wrote:
> On Thu, Feb 29, 2024 at 1:15 PM Liang, Kan <[email protected]> wrote:
>>
>>
>>
>> On 2024-02-28 7:17 p.m., Ian Rogers wrote:
>>> Allow duplicated metric to be dropped from json files.
>>>
>>> Signed-off-by: Ian Rogers <[email protected]>
>>> ---
>>> tools/perf/pmu-events/intel_metrics.py | 51 ++++++++++++++++++++++++++
>>> 1 file changed, 51 insertions(+)
>>>
>>> diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
>>> index 20c25d142f24..1096accea2aa 100755
>>> --- a/tools/perf/pmu-events/intel_metrics.py
>>> +++ b/tools/perf/pmu-events/intel_metrics.py
>>> @@ -7,6 +7,7 @@ import argparse
>>> import json
>>> import math
>>> import os
>>> +from typing import Optional
>>>
>>> parser = argparse.ArgumentParser(description="Intel perf json generator")
>>> parser.add_argument("-metricgroups", help="Generate metricgroups data", action='store_true')
>>> @@ -77,10 +78,60 @@ def Smi() -> MetricGroup:
>>> ])
>>>
>>>
>>> +def Tsx() -> Optional[MetricGroup]:
>>> + if args.model not in [
>>> + 'alderlake',
>>> + 'cascadelakex',
>>> + 'icelake',
>>> + 'icelakex',
>>> + 'rocketlake',
>>> + 'sapphirerapids',
>>> + 'skylake',
>>> + 'skylakex',
> >>> + 'tigerlake',
> >>> + ]:
>>
>> Can we get rid of the model list? Otherwise, we have to keep updating
>> the list.
>
> Do we expect the list to update? :-)

Yes, at least for the meteorlake and graniterapids. They should be the
same as alderlake and sapphirerapids. I'm not sure about the future
platforms.

Maybe we can have an "if args.model in list" check here to include all the
non-hybrid models which don't support TSX. I think the list should not
need to change soon.

> The issue is the events are in
> sysfs and not the json. If we added the tsx events to json then this
> list wouldn't be necessary, but it also would mean the events would be
> present in "perf list" even when TSX is disabled.

I think there may be an alternative way: check the RTM events, e.g. the
RTM_RETIRED.START event. We only need to generate the metrics for the
platforms which support the RTM_RETIRED.START event.


>
>>> + return None
> >>> +
> >>> + pmu = "cpu_core" if args.model == "alderlake" else "cpu"
>>
>> Is it possible to change the check to the existence of the "cpu" PMU
>> here? has_pmu("cpu") ? "cpu" : "cpu_core"
>
> The "Unit" on "cpu" events in json is always just blank. On hybrid it is
> either "cpu_core" or "cpu_atom", so I can make this something like:
>
> pmu = "cpu_core" if metrics.HasPmu("cpu_core") else "cpu"
>
> which would be a build time test.

Yes, I think using the "Unit" is good enough.

>
>
>>> + cycles = Event('cycles')
>>> + cycles_in_tx = Event(f'{pmu}/cycles\-t/')
>>> + transaction_start = Event(f'{pmu}/tx\-start/')
>>> + cycles_in_tx_cp = Event(f'{pmu}/cycles\-ct/')
>>> + metrics = [
>>> + Metric('tsx_transactional_cycles',
>>> + 'Percentage of cycles within a transaction region.',
>>> + Select(cycles_in_tx / cycles, has_event(cycles_in_tx), 0),
>>> + '100%'),
>>> + Metric('tsx_aborted_cycles', 'Percentage of cycles in aborted transactions.',
>>> + Select(max(cycles_in_tx - cycles_in_tx_cp, 0) / cycles,
>>> + has_event(cycles_in_tx),
>>> + 0),
>>> + '100%'),
>>> + Metric('tsx_cycles_per_transaction',
>>> + 'Number of cycles within a transaction divided by the number of transactions.',
>>> + Select(cycles_in_tx / transaction_start,
>>> + has_event(cycles_in_tx),
>>> + 0),
>>> + "cycles / transaction"),
>>> + ]
>>> + if args.model != 'sapphirerapids':
>>
>> Add the "tsx_cycles_per_elision" metric only if
>> has_event(f'{pmu}/el\-start/')?
>
> It's a sysfs event, so this wouldn't work :-(

The below is the definition of el-start in the kernel.
EVENT_ATTR_STR(el-start, el_start, "event=0xc8,umask=0x1");

The corresponding event in the event list should be HLE_RETIRED.START
"EventCode": "0xC8",
"UMask": "0x01",
"EventName": "HLE_RETIRED.START",

I think we may check the HLE_RETIRED.START instead. If the
HLE_RETIRED.START doesn't exist, I don't see a reason why the
tsx_cycles_per_elision should be supported.

Again, in the virtualization world, it's possible that the
HLE_RETIRED.START exists in the event list but el_start isn't available
in the sysfs. I think it has to be specially handled in the test as well.

Thanks,
Kan

>
> Thanks,
> Ian
>
>> Thanks,
>> Kan
>>
>>> + elision_start = Event(f'{pmu}/el\-start/')
>>> + metrics += [
>>> + Metric('tsx_cycles_per_elision',
>>> + 'Number of cycles within a transaction divided by the number of elisions.',
>>> + Select(cycles_in_tx / elision_start,
>>> + has_event(elision_start),
>>> + 0),
>>> + "cycles / elision"),
>>> + ]
>>> + return MetricGroup('transaction', metrics)
>>> +
>>> +
>>> all_metrics = MetricGroup("", [
>>> Idle(),
>>> Rapl(),
>>> Smi(),
>>> + Tsx(),
>>> ])
>>>
>>> if args.metricgroups:
>

2024-03-01 16:48:00

by Ian Rogers

[permalink] [raw]
Subject: Re: [PATCH v1 04/20] perf jevents: Add tsx metric group for Intel models

On Fri, Mar 1, 2024 at 6:52 AM Liang, Kan <[email protected]> wrote:
>
>
>
> On 2024-02-29 8:01 p.m., Ian Rogers wrote:
> > On Thu, Feb 29, 2024 at 1:15 PM Liang, Kan <[email protected]> wrote:
> >>
> >>
> >>
> >> On 2024-02-28 7:17 p.m., Ian Rogers wrote:
> >>> Allow duplicated metric to be dropped from json files.
> >>>
> >>> Signed-off-by: Ian Rogers <[email protected]>
> >>> ---
> >>> tools/perf/pmu-events/intel_metrics.py | 51 ++++++++++++++++++++++++++
> >>> 1 file changed, 51 insertions(+)
> >>>
> >>> diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
> >>> index 20c25d142f24..1096accea2aa 100755
> >>> --- a/tools/perf/pmu-events/intel_metrics.py
> >>> +++ b/tools/perf/pmu-events/intel_metrics.py
> >>> @@ -7,6 +7,7 @@ import argparse
> >>> import json
> >>> import math
> >>> import os
> >>> +from typing import Optional
> >>>
> >>> parser = argparse.ArgumentParser(description="Intel perf json generator")
> >>> parser.add_argument("-metricgroups", help="Generate metricgroups data", action='store_true')
> >>> @@ -77,10 +78,60 @@ def Smi() -> MetricGroup:
> >>> ])
> >>>
> >>>
> >>> +def Tsx() -> Optional[MetricGroup]:
> >>> + if args.model not in [
> >>> + 'alderlake',
> >>> + 'cascadelakex',
> >>> + 'icelake',
> >>> + 'icelakex',
> >>> + 'rocketlake',
> >>> + 'sapphirerapids',
> >>> + 'skylake',
> >>> + 'skylakex',
> >>> + 'tigerlake',
> >>> + ]:
> >>
> >> Can we get rid of the model list? Otherwise, we have to keep updating
> >> the list.
> >
> > Do we expect the list to update? :-)
>
> Yes, at least for the meteorlake and graniterapids. They should be the
> same as alderlake and sapphirerapids. I'm not sure about the future
> platforms.
>
> Maybe we can have an "if args.model in list" check here to include all the
> non-hybrid models which don't support TSX. I think the list should not
> need to change soon.
>
> > The issue is the events are in
> > sysfs and not the json. If we added the tsx events to json then this
> > list wouldn't be necessary, but it also would mean the events would be
> > present in "perf list" even when TSX is disabled.
>
> I think there may be an alternative way: check the RTM events, e.g. the
> RTM_RETIRED.START event. We only need to generate the metrics for the
> platforms which support the RTM_RETIRED.START event.
>
>
> >
> >>> + return None
> >>> +
> >>> + pmu = "cpu_core" if args.model == "alderlake" else "cpu"
> >>
> >> Is it possible to change the check to the existence of the "cpu" PMU
> >> here? has_pmu("cpu") ? "cpu" : "cpu_core"
> >
> > The "Unit" on "cpu" events in json is always just blank. On hybrid it is
> > either "cpu_core" or "cpu_atom", so I can make this something like:
> >
> > pmu = "cpu_core" if metrics.HasPmu("cpu_core") else "cpu"
> >
> > which would be a build time test.
>
> Yes, I think using the "Unit" is good enough.
>
> >
> >
> >>> + cycles = Event('cycles')
> >>> + cycles_in_tx = Event(f'{pmu}/cycles\-t/')
> >>> + transaction_start = Event(f'{pmu}/tx\-start/')
> >>> + cycles_in_tx_cp = Event(f'{pmu}/cycles\-ct/')
> >>> + metrics = [
> >>> + Metric('tsx_transactional_cycles',
> >>> + 'Percentage of cycles within a transaction region.',
> >>> + Select(cycles_in_tx / cycles, has_event(cycles_in_tx), 0),
> >>> + '100%'),
> >>> + Metric('tsx_aborted_cycles', 'Percentage of cycles in aborted transactions.',
> >>> + Select(max(cycles_in_tx - cycles_in_tx_cp, 0) / cycles,
> >>> + has_event(cycles_in_tx),
> >>> + 0),
> >>> + '100%'),
> >>> + Metric('tsx_cycles_per_transaction',
> >>> + 'Number of cycles within a transaction divided by the number of transactions.',
> >>> + Select(cycles_in_tx / transaction_start,
> >>> + has_event(cycles_in_tx),
> >>> + 0),
> >>> + "cycles / transaction"),
> >>> + ]
> >>> + if args.model != 'sapphirerapids':
> >>
> >> Add the "tsx_cycles_per_elision" metric only if
> >> has_event(f'{pmu}/el\-start/')?
> >
> > It's a sysfs event, so this wouldn't work :-(
>
> The below is the definition of el-start in the kernel.
> EVENT_ATTR_STR(el-start, el_start, "event=0xc8,umask=0x1");
>
> The corresponding event in the event list should be HLE_RETIRED.START
> "EventCode": "0xC8",
> "UMask": "0x01",
> "EventName": "HLE_RETIRED.START",
>
> I think we may check the HLE_RETIRED.START instead. If the
> HLE_RETIRED.START doesn't exist, I don't see a reason why the
> tsx_cycles_per_elision should be supported.
>
> Again, in the virtualization world, it's possible that the
> HLE_RETIRED.START exists in the event list but el_start isn't available
> in the sysfs. I think it has to be specially handled in the test as well.

So we keep the has_event test on the sysfs event to handle the
virtualization and disabled case. We use HLE_RETIRED.START to detect
whether the model supports TSX. Should the event be the sysfs or json
version? i.e.

"MetricExpr": "(cycles\\-t / el\\-start if has_event(el\\-start) else 0)",

or

"MetricExpr": "(cycles\\-t / HLE_RETIRED.START if has_event(el\\-start) else 0)",

I think I favor the former for some consistency with the has_event.

Using HLE_RETIRED.START means the set of TSX models goes from:
'alderlake',
'cascadelakex',
'icelake',
'icelakex',
'rocketlake',
'sapphirerapids',
'skylake',
'skylakex',
'tigerlake',

To:
broadwell
broadwellde
broadwellx
cascadelakex
haswell
haswellx
icelake
rocketlake
skylake
skylakex

Using RTM_RETIRED.START it goes to:
broadwell
broadwellde
broadwellx
cascadelakex
emeraldrapids
graniterapids
haswell
haswellx
icelake
icelakex
rocketlake
sapphirerapids
skylake
skylakex
tigerlake

So I'm not sure it is working equivalently to what we have today,
which may be good or bad. Here is what I think the code should look
like:

def Tsx() -> Optional[MetricGroup]:
  pmu = "cpu_core" if CheckPmu("cpu_core") else "cpu"
  cycles = Event('cycles')
  cycles_in_tx = Event(f'{pmu}/cycles\-t/')
  cycles_in_tx_cp = Event(f'{pmu}/cycles\-ct/')
  try:
    # Test if the tsx event is present in the json, prefer the
    # sysfs version so that we can detect its presence at runtime.
    transaction_start = Event("RTM_RETIRED.START")
    transaction_start = Event(f'{pmu}/tx\-start/')
  except:
    return None

  elision_start = None
  try:
    # Elision start isn't supported by all models, but we'll not
    # generate the tsx_cycles_per_elision metric in that
    # case. Again, prefer the sysfs encoding of the event.
    elision_start = Event("HLE_RETIRED.START")
    elision_start = Event(f'{pmu}/el\-start/')
  except:
    pass

  return MetricGroup('transaction', [
      Metric('tsx_transactional_cycles',
             'Percentage of cycles within a transaction region.',
             Select(cycles_in_tx / cycles, has_event(cycles_in_tx), 0),
             '100%'),
      Metric('tsx_aborted_cycles',
             'Percentage of cycles in aborted transactions.',
             Select(max(cycles_in_tx - cycles_in_tx_cp, 0) / cycles,
                    has_event(cycles_in_tx),
                    0),
             '100%'),
      Metric('tsx_cycles_per_transaction',
             'Number of cycles within a transaction divided by the number of transactions.',
             Select(cycles_in_tx / transaction_start,
                    has_event(cycles_in_tx),
                    0),
             "cycles / transaction"),
      Metric('tsx_cycles_per_elision',
             'Number of cycles within a transaction divided by the number of elisions.',
             Select(cycles_in_tx / elision_start,
                    has_event(elision_start),
                    0),
             "cycles / elision") if elision_start else None,
  ], description="Breakdown of transactional memory statistics")

Wdyt?

Thanks,
Ian

> Thanks,
> Kan
>
> >
> > Thanks,
> > Ian
> >
> >> Thanks,
> >> Kan
> >>
> >>> + elision_start = Event(f'{pmu}/el\-start/')
> >>> + metrics += [
> >>> + Metric('tsx_cycles_per_elision',
> >>> + 'Number of cycles within a transaction divided by the number of elisions.',
> >>> + Select(cycles_in_tx / elision_start,
> >>> + has_event(elision_start),
> >>> + 0),
> >>> + "cycles / elision"),
> >>> + ]
> >>> + return MetricGroup('transaction', metrics)
> >>> +
> >>> +
> >>> all_metrics = MetricGroup("", [
> >>> Idle(),
> >>> Rapl(),
> >>> Smi(),
> >>> + Tsx(),
> >>> ])
> >>>
> >>> if args.metricgroups:
> >

2024-03-01 17:50:09

by Andi Kleen

Subject: Re: [PATCH v1 02/20] perf jevents: Add idle metric for Intel models

> +def Idle() -> Metric:
> + cyc = Event("msr/mperf/")
> + tsc = Event("msr/tsc/")
> + low = max(tsc - cyc, 0)
> + return Metric(
> + "idle",
> + "Percentage of total wallclock cycles where CPUs are in low power state (C1 or deeper sleep state)",
> + d_ratio(low, tsc), "100%")

TBH I fail to see the advantage over the JSON. That's much more verbose
and we don't expect to have really complex metrics anyways.

And then we have a gigantic patch kit for what gain?

The motivation was the lack of comments in JSON? We could just add some
to the parser (e.g. with /* */). And we could allow a JSON array for the
expression to get multiple lines.
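For example a multi-line expression might be encoded like this (purely
hypothetical, the current parser supports none of it, and the metric
name is made up):

```json
{
  "MetricName": "example_metric",  /* comments would need parser support */
  "MetricExpr": [
    "(EXE_ACTIVITY.1_PORTS_UTIL + tma_retiring * EXE_ACTIVITY.2_PORTS_UTIL)",
    " / CPU_CLK_UNHALTED.THREAD"
  ]
}
```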


-Andi

2024-03-01 18:18:31

by Ian Rogers

Subject: Re: [PATCH v1 02/20] perf jevents: Add idle metric for Intel models

On Fri, Mar 1, 2024 at 9:49 AM Andi Kleen <[email protected]> wrote:
>
> > +def Idle() -> Metric:
> > + cyc = Event("msr/mperf/")
> > + tsc = Event("msr/tsc/")
> > + low = max(tsc - cyc, 0)
> > + return Metric(
> > + "idle",
> > + "Percentage of total wallclock cycles where CPUs are in low power state (C1 or deeper sleep state)",
> > + d_ratio(low, tsc), "100%")
>
> TBH I fail to see the advantage over the JSON. That's much more verbose
> and we don't expect to have really complex metrics anyways.

Are you saying this is more verbose or the json? Here is an example of
a json metric:

https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/pmu-events/arch/x86/skylakex/skx-metrics.json?h=perf-tools-next#n652
```
{
"BriefDescription": "Probability of Core Bound bottleneck
hidden by SMT-profiling artifacts",
"MetricExpr": "(100 * (1 - tma_core_bound /
(((EXE_ACTIVITY.EXE_BOUND_0_PORTS + tma_core_bound *
RS_EVENTS.EMPTY_CYCLES) / CPU_CLK_UNHALTED.THREAD *
(CYCLE_ACTIVITY.STALLS_TOTAL - CYCLE_ACTIVITY.STALLS_MEM_ANY) /
CPU_CLK_UNHALTED.THREAD * CPU_CLK_UNHALTED.THREAD +
(EXE_ACTIVITY.1_PORTS_UTIL + tma_retiring *
EXE_ACTIVITY.2_PORTS_UTIL)) / CPU_CLK_UNHALTED.THREAD if
ARITH.DIVIDER_ACTIVE < CYCLE_ACTIVITY.STALLS_TOTAL -
CYCLE_ACTIVITY.STALLS_MEM_ANY else (EXE_ACTIVITY.1_PORTS_UTIL +
tma_retiring * EXE_ACTIVITY.2_PORTS_UTIL) / CPU_CLK_UNHALTED.THREAD)
if tma_core_bound < (((EXE_ACTIVITY.EXE_BOUND_0_PORTS + tma_core_bound
* RS_EVENTS.EMPTY_CYCLES) / CPU_CLK_UNHALTED.THREAD *
(CYCLE_ACTIVITY.STALLS_TOTAL - CYCLE_ACTIVITY.STALLS_MEM_ANY) /
CPU_CLK_UNHALTED.THREAD * CPU_CLK_UNHALTED.THREAD +
(EXE_ACTIVITY.1_PORTS_UTIL + tma_retiring *
EXE_ACTIVITY.2_PORTS_UTIL)) / CPU_CLK_UNHALTED.THREAD if
ARITH.DIVIDER_ACTIVE < CYCLE_ACTIVITY.STALLS_TOTAL -
CYCLE_ACTIVITY.STALLS_MEM_ANY else (EXE_ACTIVITY.1_PORTS_UTIL +
tma_retiring * EXE_ACTIVITY.2_PORTS_UTIL) / CPU_CLK_UNHALTED.THREAD)
else 1) if tma_info_system_smt_2t_utilization > 0.5 else 0)",
"MetricGroup": "Cor;SMT;TopdownL1;tma_L1_group",
"MetricName": "tma_info_botlnk_core_bound_likely",
"MetricgroupNoGroup": "TopdownL1"
},
```

Even with common metrics like tma_core_bound, tma_retiring and
tma_info_system_smt_2t_utilization replacing sections of the metric, I
think anyone has to admit the expression is pretty unintelligible
because of its size/verbosity. Just understanding the metric would, as
a first step, involve adding newlines. Comments would be nice, etc.
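As a toy illustration of that point (plain python, with made-up numbers
standing in for event counts, nothing measured):

```python
# Made-up numbers standing in for event counts; only the naming matters.
stalls_total = 900.0    # e.g. CYCLE_ACTIVITY.STALLS_TOTAL
stalls_mem_any = 300.0  # e.g. CYCLE_ACTIVITY.STALLS_MEM_ANY
clks = 2000.0           # e.g. CPU_CLK_UNHALTED.THREAD

# How a flat MetricExpr string reads, everything inline:
flat = (stalls_total - stalls_mem_any) / clks

# The same computation with a named, commented sub-expression:
core_stalls = stalls_total - stalls_mem_any  # stalls not waiting on memory
named = core_stalls / clks

assert flat == named
print(named)  # 0.3
```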

> And then we have a gigantic patch kit for what gain?

I see some of the gains as:
- metrics that are human intelligible,
- metrics for models that are no longer being updated,
- removing copy-paste of metrics like tsx and smi across each model's
metric json (less lines-of-code),
- validation of events in a metric expression being in the event json
for a model,
- removal of forward porting metrics to a new model if the event
names of the new model line up with those of previous,
- in this patch kit there are metrics added that don't currently
exist (more metrics should be better for users - yes there can always
be bugs).

I also hope the metric grouping is clearer, etc, etc.

> The motivation was the lack of comments in JSON? We could just add some
> to the parser (e.g. with /* */). And we could allow a JSON array for the
> expression to get multiple lines.

Imo, a non-json variant of json would just be taking on a tech debt
burden for something that is inferior to this approach and a wasted
effort. We already generate the json from other more intelligible
sources using python - so I don't find the approach controversial:
https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py

The goal here has been to make a bunch of inhouse metrics public. It
also gives a foundation for vendors and other concerned people to add
metrics in a concise, documented and safe (broken events cause
compile-time failures) way. There are some similar things like common
events/metrics on ARM:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/pmu-events/arch/arm64/arm/cmn/sys?h=perf-tools-next
but this lacks the structure, validation, documentation, etc. that's
present here so my preference would be for more of the common things
to be done in the python way started here.

Thanks,
Ian

> -Andi

2024-03-01 21:34:13

by Andi Kleen

Subject: Re: [PATCH v1 02/20] perf jevents: Add idle metric for Intel models

>
> I see some of the gains as:
> - metrics that are human intelligible,
> - metrics for models that are no longer being updated,
> - removing copy-paste of metrics like tsx and smi across each model's
> metric json (less lines-of-code),
> - validation of events in a metric expression being in the event json
> for a model,
> - removal of forward porting metrics to a new model if the event
> names of the new model line up with those of previous,
> - in this patch kit there are metrics added that don't currently
> exist (more metrics should be better for users - yes there can always
> be bugs).

But then we have two ways to do things, and we already have a lot
of problems with regressions from complexity and a growing
bug backlog that nobody fixes.

Multiple ways to do basic operations seem just a recipe for
more and more fragmentation and similar problems.

The JSON format is certainly not perfect and has its share
of issues, but at least it's a standard now that is supported
by many vendors and creating new standards just because
you don't like some minor aspects doesn't seem like
a good approach. I'm sure the next person will come around
who wants Ruby metrics and the third would prefer to write
them in Rust. Who knows where it will stop.

Also in my experience this python stuff is unreliable because
half the people who build perf forget to install the python
libraries. Json at least works always.

Incremental improvements are usually the way to do these
things.

-Andi

2024-03-01 22:32:24

by Liang, Kan

Subject: Re: [PATCH v1 04/20] perf jevents: Add tsx metric group for Intel models



On 2024-03-01 11:37 a.m., Ian Rogers wrote:
> On Fri, Mar 1, 2024 at 6:52 AM Liang, Kan <[email protected]> wrote:
>>
>>
>>
>> On 2024-02-29 8:01 p.m., Ian Rogers wrote:
>>> On Thu, Feb 29, 2024 at 1:15 PM Liang, Kan <[email protected]> wrote:
>>>>
>>>>
>>>>
>>>> On 2024-02-28 7:17 p.m., Ian Rogers wrote:
>>>>> Allow duplicated metric to be dropped from json files.
>>>>>
>>>>> Signed-off-by: Ian Rogers <[email protected]>
>>>>> ---
>>>>> tools/perf/pmu-events/intel_metrics.py | 51 ++++++++++++++++++++++++++
>>>>> 1 file changed, 51 insertions(+)
>>>>>
>>>>> diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
>>>>> index 20c25d142f24..1096accea2aa 100755
>>>>> --- a/tools/perf/pmu-events/intel_metrics.py
>>>>> +++ b/tools/perf/pmu-events/intel_metrics.py
>>>>> @@ -7,6 +7,7 @@ import argparse
>>>>> import json
>>>>> import math
>>>>> import os
>>>>> +from typing import Optional
>>>>>
>>>>> parser = argparse.ArgumentParser(description="Intel perf json generator")
>>>>> parser.add_argument("-metricgroups", help="Generate metricgroups data", action='store_true')
>>>>> @@ -77,10 +78,60 @@ def Smi() -> MetricGroup:
>>>>> ])
>>>>>
>>>>>
>>>>> +def Tsx() -> Optional[MetricGroup]:
>>>>> + if args.model not in [
>>>>> + 'alderlake',
>>>>> + 'cascadelakex',
>>>>> + 'icelake',
>>>>> + 'icelakex',
>>>>> + 'rocketlake',
>>>>> + 'sapphirerapids',
>>>>> + 'skylake',
>>>>> + 'skylakex',
> >>>> + 'tigerlake',
> >>>> + ]:
>>>>
> >>>> Can we get rid of the model list? Otherwise, we have to keep updating
>>>> the list.
>>>
>>> Do we expect the list to update? :-)
>>
>> Yes, at least for the meteorlake and graniterapids. They should be the
>> same as alderlake and sapphirerapids. I'm not sure about the future
>> platforms.
>>
>> Maybe we can have a if args.model in list here to include all the
>> non-hybrid models which doesn't support TSX. I think the list should not
>> be changed shortly.
>>
>>> The issue is the events are in
>>> sysfs and not the json. If we added the tsx events to json then this
>>> list wouldn't be necessary, but it also would mean the events would be
>>> present in "perf list" even when TSX is disabled.
>>
> >> I think there may be an alternative way to check the RTM events, e.g.,
>> RTM_RETIRED.START event. We only need to generate the metrics for the
>> platform which supports the RTM_RETIRED.START event.
>>
>>
>>>
>>>>> + return None
> >>>>> +
> >>>>> + pmu = "cpu_core" if args.model == "alderlake" else "cpu"
>>>>
>>>> Is it possible to change the check to the existence of the "cpu" PMU
>>>> here? has_pmu("cpu") ? "cpu" : "cpu_core"
>>>
>>> The "Unit" on "cpu" events in json always just blank. On hybrid it is
>>> either "cpu_core" or "cpu_atom", so I can make this something like:
>>>
>>> pmu = "cpu_core" if metrics.HasPmu("cpu_core") else "cpu"
>>>
>>> which would be a build time test.
>>
>> Yes, I think using the "Unit" is good enough.
>>
>>>
>>>
>>>>> + cycles = Event('cycles')
>>>>> + cycles_in_tx = Event(f'{pmu}/cycles\-t/')
>>>>> + transaction_start = Event(f'{pmu}/tx\-start/')
>>>>> + cycles_in_tx_cp = Event(f'{pmu}/cycles\-ct/')
>>>>> + metrics = [
>>>>> + Metric('tsx_transactional_cycles',
>>>>> + 'Percentage of cycles within a transaction region.',
>>>>> + Select(cycles_in_tx / cycles, has_event(cycles_in_tx), 0),
>>>>> + '100%'),
>>>>> + Metric('tsx_aborted_cycles', 'Percentage of cycles in aborted transactions.',
>>>>> + Select(max(cycles_in_tx - cycles_in_tx_cp, 0) / cycles,
>>>>> + has_event(cycles_in_tx),
>>>>> + 0),
>>>>> + '100%'),
>>>>> + Metric('tsx_cycles_per_transaction',
>>>>> + 'Number of cycles within a transaction divided by the number of transactions.',
>>>>> + Select(cycles_in_tx / transaction_start,
>>>>> + has_event(cycles_in_tx),
>>>>> + 0),
>>>>> + "cycles / transaction"),
>>>>> + ]
>>>>> + if args.model != 'sapphirerapids':
>>>>
>>>> Add the "tsx_cycles_per_elision" metric only if
>>>> has_event(f'{pmu}/el\-start/')?
>>>
>>> It's a sysfs event, so this wouldn't work :-(
>>
>> The below is the definition of el-start in the kernel.
>> EVENT_ATTR_STR(el-start, el_start, "event=0xc8,umask=0x1");
>>
>> The corresponding event in the event list should be HLE_RETIRED.START
>> "EventCode": "0xC8",
>> "UMask": "0x01",
>> "EventName": "HLE_RETIRED.START",
>>
>> I think we may check the HLE_RETIRED.START instead. If the
>> HLE_RETIRED.START doesn't exist, I don't see a reason why the
>> tsx_cycles_per_elision should be supported.
>>
>> Again, in the virtualization world, it's possible that the
>> HLE_RETIRED.START exists in the event list but el_start isn't available
>> in the sysfs. I think it has to be specially handled in the test as well.
>
> So we keep the has_event test on the sysfs event to handle the
> virtualization and disabled case. We use HLE_RETIRED.START to detect
> whether the model supports TSX.

Yes. I think the JSON event list always keeps the latest status of an
event. If an event is deprecated someday, I don't think there is a
reason to keep any metrics including that event. So we should use it to
check whether to generate a metric.

The sysfs event tells if the current kernel supports the event. It
should be used to check whether a metric should be used/enabled.

> Should the event be the sysfs or json
> version? i.e.
>
> "MetricExpr": "(cycles\\-t / el\\-start if
> has_event(el\\-start) else 0)",
>
> or
>
> "MetricExpr": "(cycles\\-t / HLE_RETIRED.START if
> has_event(el\\-start) else 0)",
>
> I think I favor the former for some consistency with the has_event.
>

Agree, the former looks good to me too.


> Using HLE_RETIRED.START means the set of TSX models goes from:
> 'alderlake',
> 'cascadelakex',
> 'icelake',
> 'icelakex',
> 'rocketlake',
> 'sapphirerapids',
> 'skylake',
> 'skylakex',
> 'tigerlake',
>
> To:
> broadwell
> broadwellde
> broadwellx
> cascadelakex
> haswell
> haswellx
> icelake
> rocketlake
> skylake
> skylakex
>
> Using RTM_RETIRED.START it goes to:
> broadwell
> broadwellde
> broadwellx
> cascadelakex
> emeraldrapids
> graniterapids
> haswell
> haswellx
> icelake
> icelakex
> rocketlake
> sapphirerapids
> skylake
> skylakex
> tigerlake
>
> So I'm not sure it is working equivalently to what we have today,
> which may be good or bad. Here is what I think the code should look
> like:

Yes, there should be some changes. But I think the changes should be good.

For icelakex, the HLE_RETIRED.START has been deprecated. I don't see a
reason why perf should keep the tsx_cycles_per_elision metric.

For alderlake, TSX is deprecated. Perf should drop the related
metrics as well.
https://edc.intel.com/content/www/us/en/design/ipla/software-development-platforms/client/platforms/alder-lake-desktop/12th-generation-intel-core-processors-datasheet-volume-1-of-2/001/deprecated-technologies/

>
> def Tsx() -> Optional[MetricGroup]:
>   pmu = "cpu_core" if CheckPmu("cpu_core") else "cpu"
>   cycles = Event('cycles')
>   cycles_in_tx = Event(f'{pmu}/cycles\-t/')
>   cycles_in_tx_cp = Event(f'{pmu}/cycles\-ct/')
>   try:
>     # Test if the tsx event is present in the json, prefer the
>     # sysfs version so that we can detect its presence at runtime.
>     transaction_start = Event("RTM_RETIRED.START")
>     transaction_start = Event(f'{pmu}/tx\-start/')
>   except:
>     return None
>
>   elision_start = None
>   try:
>     # Elision start isn't supported by all models, but we'll not
>     # generate the tsx_cycles_per_elision metric in that
>     # case. Again, prefer the sysfs encoding of the event.
>     elision_start = Event("HLE_RETIRED.START")
>     elision_start = Event(f'{pmu}/el\-start/')
>   except:
>     pass
>
>   return MetricGroup('transaction', [
>       Metric('tsx_transactional_cycles',
>              'Percentage of cycles within a transaction region.',
>              Select(cycles_in_tx / cycles, has_event(cycles_in_tx), 0),
>              '100%'),
>       Metric('tsx_aborted_cycles',
>              'Percentage of cycles in aborted transactions.',
>              Select(max(cycles_in_tx - cycles_in_tx_cp, 0) / cycles,
>                     has_event(cycles_in_tx),
>                     0),
>              '100%'),
>       Metric('tsx_cycles_per_transaction',
>              'Number of cycles within a transaction divided by the number of transactions.',
>              Select(cycles_in_tx / transaction_start,
>                     has_event(cycles_in_tx),
>                     0),
>              "cycles / transaction"),
>       Metric('tsx_cycles_per_elision',
>              'Number of cycles within a transaction divided by the number of elisions.',
>              Select(cycles_in_tx / elision_start,
>                     has_event(elision_start),
>                     0),
>              "cycles / elision") if elision_start else None,
>   ], description="Breakdown of transactional memory statistics")
>
> Wdyt?

Looks good to me.

Thanks,
Kan
>
> Thanks,
> Ian
>
>> Thanks,
>> Kan
>>
>>>
>>> Thanks,
>>> Ian
>>>
>>>> Thanks,
>>>> Kan
>>>>
>>>>> + elision_start = Event(f'{pmu}/el\-start/')
>>>>> + metrics += [
>>>>> + Metric('tsx_cycles_per_elision',
>>>>> + 'Number of cycles within a transaction divided by the number of elisions.',
>>>>> + Select(cycles_in_tx / elision_start,
>>>>> + has_event(elision_start),
>>>>> + 0),
>>>>> + "cycles / elision"),
>>>>> + ]
>>>>> + return MetricGroup('transaction', metrics)
>>>>> +
>>>>> +
>>>>> all_metrics = MetricGroup("", [
>>>>> Idle(),
>>>>> Rapl(),
>>>>> Smi(),
>>>>> + Tsx(),
>>>>> ])
>>>>>
>>>>> if args.metricgroups:
>>>

2024-03-01 23:09:50

by Ian Rogers

Subject: Re: [PATCH v1 02/20] perf jevents: Add idle metric for Intel models

On Fri, Mar 1, 2024 at 1:34 PM Andi Kleen <[email protected]> wrote:
>
> >
> > I see some of the gains as:
> > - metrics that are human intelligible,
> > - metrics for models that are no longer being updated,
> > - removing copy-paste of metrics like tsx and smi across each model's
> > metric json (less lines-of-code),
> > - validation of events in a metric expression being in the event json
> > for a model,
> > - removal of forward porting metrics to a new model if the event
> > names of the new model line up with those of previous,
> > - in this patch kit there are metrics added that don't currently
> > exist (more metrics should be better for users - yes there can always
> > be bugs).
>
> But then we have two ways to do things, and we already have a lot
> of problems with regressions from complexity and a growing
> bug backlog that nobody fixes.

If you want something to work you put a test on it. We have a number
of both event and metric tests. I'm not sure what the bug backlog you
are mentioning is, but as far as I can see the tool is in the best
condition it has ever been. All tests passing with address sanitizer
was a particular milestone last year.

> Multiple ways to do basic operations seems just a recipe for
> more and more fragmentation and similar problems.
>
> The JSON format is certainly not perfect and has its share
> of issues, but at least it's a standard now that is supported
> by many vendors and creating new standards just because
> you don't like some minor aspects doesn't seem like
> a good approach. I'm sure the next person will come around
> who wants Ruby metrics and the third would prefer to write
> them in Rust. Who knows where it will stop.

These patches don't make the json format disappear; we use python to
generate the json metrics, as json strings are a poor programming
language.

I agree we have too many formats, but json is part of the problem
there not the solution. I would like to make the only format the sysfs
one, and then we can do like a unionfs type thing in the perf tool
where we can have sysfs, a sysfs layer built into the tool (out of the
json) and possibly user specified layers. This would allow
customizability once the binary is built, but it would also allow us
to test with a sysfs for a machine we don't have. Linux on M1 Macs is
a particular issue, but we recently had an issue with the layout of
the format directory for Intel uncore_pcu pre-Skylake which doesn't
have a umask. Finding such machines to test on is the challenge, and
abstracting sysfs as a unionfs type thing is, I think, the correct
approach.
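To sketch the layering idea (a toy dict-based illustration, nothing
like the eventual perf implementation; the paths and format values are
made up):

```python
from collections import ChainMap

# Toy layers mimicking sysfs key/value pairs. Lookup order: a
# user-specified layer, then the layer built into the tool (out of the
# json), then the real sysfs.
real_sysfs = {"cpu/events/cycles": "event=0x3c"}
builtin = {
    "cpu/events/cycles": "event=0x3c",
    "uncore_pcu/format/event": "config:0-7",
}
user_layer = {"uncore_pcu/format/event": "config:0-21"}

layered = ChainMap(user_layer, builtin, real_sysfs)
print(layered["cpu/events/cycles"])        # event=0x3c
print(layered["uncore_pcu/format/event"])  # config:0-21 (user layer wins)
```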

I don't think the Linux build has tooling around Ruby, and there are
no host tools written in Rust yet. Will it happen? Probably, and I
think it is good the codebase keeps moving forward. Before the C
reference count checking implementation, we were talking about
rewriting at least pieces like libperf in Rust - the code was leaking
memory and it seemed unsolvable as reasonable fixes would yield
use-after-frees and crashes. I've even mentioned this in LWN comments
on articles around Rust, nobody stepped up with a fix until I did the
reference count checking.

Python is a good choice for reading json as the inbuilt library is of
a reasonable quality. Python is good for what I've done here as the
operator overloading makes the expressions readable. We can read in
and out of the python tree format, and do so in jevents.py to validate
the metrics can parse (we still have the C parse test). We haven't
written a full expression parser in python, although it wouldn't be
hard, we just ack the string and pretty much call eval. It'd be
relatively easy to add an output function to the python code to make
it convert the expressions to a different programming language, for
example the ToPython code here:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/pmu-events/metric.py?h=perf-tools-next#n17
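To give a flavor of the operator overloading (a stripped-down toy, not
the actual metric.py classes; the ToPerfJson name here is just
illustrative):

```python
class Expr:
    """Toy expression node; the real metric.py classes are richer."""
    def __init__(self, s: str):
        self.s = s

    def __sub__(self, other: "Expr") -> "Expr":
        # Parenthesize so operator precedence survives in the string.
        return Expr(f"({self.s} - {other.s})")

    def __truediv__(self, other: "Expr") -> "Expr":
        return Expr(f"({self.s} / {other.s})")

    def ToPerfJson(self) -> str:
        return self.s


def Event(name: str) -> Expr:
    # The real Event also validates the name against the model's json.
    return Expr(name)


cyc = Event("msr/mperf/")
tsc = Event("msr/tsc/")
idle = (tsc - cyc) / tsc  # reads like math, serializes to a MetricExpr
print(idle.ToPerfJson())  # ((msr/tsc/ - msr/mperf/) / msr/tsc/)
```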

> Also in my experience this python stuff is unreliable because
> half the people who build perf forget to install the python
> libraries. Json at least works always.

It has been the case for about a year (v6.4):
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/commit/?h=perf-tools-next&id=175f9315f76345e88e3abdc947c1e0030ab99da3
that if we can't build jevents because of python then the build fails.
You can explicitly request not to use python/jevents with
NO_JEVENTS=1, but it is an explicit opt-out.
I don't think this is unreliable. We've recently made BPF opt-out
rather than opt-in in a similar way, and that requires clang, etc. It
has been a problem in the past that implicit opt-in and opt-out could
give you a perf tool that a distribution couldn't ship (mixed GPLv2
and v3 code) or that was missing useful things. I think we've fixed
the bug by making the build fail unless you explicitly opt-out of
options we think you should have.

Fwiw, there is a similar bug that BTF support in the kernel is opt-in
rather than opt-out, meaning distributions ship BPF tools that can't
work for the kernel they've built. If there were more time I'd be
looking to make BTF opt-out rather than opt-in, I reported the issue
on the BPF mailing list.

> Incremental improvements are usually the way to do these
> things.

We've had jevents as python for nearly 2 years. metric.py that this
code is building off has been in the tree for 15 months. I wrote the
code and there is a version of it for:
https://github.com/intel/perfmon/commits/main/scripts/create_perf_json.py
which is 2 years old. I don't see anything non-incremental, if
anything things have been slow to move forward. It's true vendors
haven't really adopted the code outside of Intel's perfmon; I've at
least discussed it with them face-to-face at events like LPC. Hopefully
this work is a foundation for vendors to write more metrics; it should
be little more effort than documenting the metric in their manuals.

ARM have a python based json tool for perf (similar to the perfmon one) here:
https://gitlab.arm.com/telemetry-solution/telemetry-solution/-/tree/main/tools/perf_json_generator
So I'd say that python and perf json is a standard approach. ARM's
converter is just over a year old.

Thanks,
Ian

> -Andi