2023-07-03 15:10:48

by Uros Bizjak

Subject: [PATCH] drm/i915/pmu: Use local64_try_cmpxchg in i915_pmu_event_read

Use local64_try_cmpxchg instead of local64_cmpxchg (*ptr, old, new) == old
in i915_pmu_event_read. The x86 CMPXCHG instruction returns success in the
ZF flag, so this change saves a compare after cmpxchg (and the related move
instruction in front of cmpxchg).

Also, try_cmpxchg implicitly assigns the old *ptr value to "old" when cmpxchg
fails, so there is no need to re-read the value in the loop.

No functional change intended.

Cc: Jani Nikula <[email protected]>
Cc: Joonas Lahtinen <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Cc: David Airlie <[email protected]>
Cc: Daniel Vetter <[email protected]>
Signed-off-by: Uros Bizjak <[email protected]>
---
drivers/gpu/drm/i915/i915_pmu.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index d35973b41186..108b675088ba 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -696,12 +696,11 @@ static void i915_pmu_event_read(struct perf_event *event)
 		event->hw.state = PERF_HES_STOPPED;
 		return;
 	}
-again:
-	prev = local64_read(&hwc->prev_count);
-	new = __i915_pmu_event_read(event);

-	if (local64_cmpxchg(&hwc->prev_count, prev, new) != prev)
-		goto again;
+	prev = local64_read(&hwc->prev_count);
+	do {
+		new = __i915_pmu_event_read(event);
+	} while (!local64_try_cmpxchg(&hwc->prev_count, &prev, new));

 	local64_add(new - prev, &event->count);
 }
--
2.41.0
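
For readers following the thread without the kernel tree at hand, the two loop shapes in the patch can be approximated in userspace. This is only a sketch under stated assumptions: it uses C11 atomic_compare_exchange_strong() as an analogue of local64_try_cmpxchg() (both refresh the expected value on failure), and struct fake_event, sample(), cmpxchg64() and the selftest are hypothetical names invented for illustration, not kernel or i915 APIs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the perf event state touched by the patch. */
struct fake_event {
	_Atomic int64_t prev_count;	/* plays the role of hwc->prev_count */
	int64_t count;			/* plays the role of event->count */
	int64_t hw;			/* fake hardware counter */
};

/* Hypothetical stand-in for __i915_pmu_event_read(): advance and sample. */
static int64_t sample(struct fake_event *e)
{
	e->hw += 5;
	return e->hw;
}

/* Value-returning cmpxchg built on C11: returns what was found in *ptr. */
static int64_t cmpxchg64(_Atomic int64_t *ptr, int64_t old, int64_t new)
{
	atomic_compare_exchange_strong(ptr, &old, new);
	return old;
}

/* Old shape: re-read prev on every retry, compare the returned value. */
static void read_with_cmpxchg(struct fake_event *e)
{
	int64_t prev, new;
again:
	prev = atomic_load(&e->prev_count);
	new = sample(e);
	if (cmpxchg64(&e->prev_count, prev, new) != prev)
		goto again;
	e->count += new - prev;
}

/* New shape: a failed exchange already refreshed prev, so only resample. */
static void read_with_try_cmpxchg(struct fake_event *e)
{
	int64_t prev, new;

	prev = atomic_load(&e->prev_count);
	do {
		new = sample(e);
	} while (!atomic_compare_exchange_strong(&e->prev_count, &prev, new));
	e->count += new - prev;
}

/* Single-threaded check that both shapes account for the same delta. */
static void selftest_loops(void)
{
	struct fake_event a = { .prev_count = 100, .count = 0, .hw = 100 };
	struct fake_event b = { .prev_count = 100, .count = 0, .hw = 100 };

	read_with_cmpxchg(&a);
	read_with_try_cmpxchg(&b);
	assert(a.count == 5 && b.count == 5);
	assert(atomic_load(&a.prev_count) == atomic_load(&b.prev_count));
}
```

The sketch says nothing about the generated x86 code; it only shows that the second form needs no explicit re-read of prev inside the loop.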



2023-07-04 07:40:35

by Jani Nikula

Subject: Re: [PATCH] drm/i915/pmu: Use local64_try_cmpxchg in i915_pmu_event_read

On Mon, 03 Jul 2023, Uros Bizjak <[email protected]> wrote:
> Use local64_try_cmpxchg instead of local64_cmpxchg (*ptr, old, new) == old
> in i915_pmu_event_read. x86 CMPXCHG instruction returns success in ZF flag,
> so this change saves a compare after cmpxchg (and related move instruction
> in front of cmpxchg).
>
> Also, try_cmpxchg implicitly assigns old *ptr value to "old" when cmpxchg
> fails. There is no need to re-read the value in the loop.
>
> No functional change intended.
>
> Cc: Jani Nikula <[email protected]>
> Cc: Joonas Lahtinen <[email protected]>
> Cc: Rodrigo Vivi <[email protected]>
> Cc: Tvrtko Ursulin <[email protected]>
> Cc: David Airlie <[email protected]>
> Cc: Daniel Vetter <[email protected]>
> Signed-off-by: Uros Bizjak <[email protected]>
> ---
> drivers/gpu/drm/i915/i915_pmu.c | 9 ++++-----
> 1 file changed, 4 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
> index d35973b41186..108b675088ba 100644
> --- a/drivers/gpu/drm/i915/i915_pmu.c
> +++ b/drivers/gpu/drm/i915/i915_pmu.c
> @@ -696,12 +696,11 @@ static void i915_pmu_event_read(struct perf_event *event)
> event->hw.state = PERF_HES_STOPPED;
> return;
> }
> -again:
> - prev = local64_read(&hwc->prev_count);
> - new = __i915_pmu_event_read(event);
>
> - if (local64_cmpxchg(&hwc->prev_count, prev, new) != prev)
> - goto again;
> + prev = local64_read(&hwc->prev_count);
> + do {
> + new = __i915_pmu_event_read(event);
> + } while (!local64_try_cmpxchg(&hwc->prev_count, &prev, new));

You could save everyone a lot of time by actually documenting what these
functions do. Assume you don't know what local64_try_cmpxchg() does, and
see how many calls you have to go through to figure it out.

Because the next time I encounter this code or a patch like this, I'm
probably going to have to do that again.

To me, the old one was more readable. The optimization is meaningless to
me if it's not quantified but reduces readability.


BR,
Jani.


>
> local64_add(new - prev, &event->count);
> }

--
Jani Nikula, Intel Open Source Graphics Center

2023-07-04 08:18:02

by Uros Bizjak

Subject: Re: [PATCH] drm/i915/pmu: Use local64_try_cmpxchg in i915_pmu_event_read

On Tue, Jul 4, 2023 at 9:28 AM Jani Nikula <[email protected]> wrote:
>
> On Mon, 03 Jul 2023, Uros Bizjak <[email protected]> wrote:
> > Use local64_try_cmpxchg instead of local64_cmpxchg (*ptr, old, new) == old
> > in i915_pmu_event_read. x86 CMPXCHG instruction returns success in ZF flag,
> > so this change saves a compare after cmpxchg (and related move instruction
> > in front of cmpxchg).
> >
> > Also, try_cmpxchg implicitly assigns old *ptr value to "old" when cmpxchg
> > fails. There is no need to re-read the value in the loop.
> >
> > No functional change intended.
> >
> > Cc: Jani Nikula <[email protected]>
> > Cc: Joonas Lahtinen <[email protected]>
> > Cc: Rodrigo Vivi <[email protected]>
> > Cc: Tvrtko Ursulin <[email protected]>
> > Cc: David Airlie <[email protected]>
> > Cc: Daniel Vetter <[email protected]>
> > Signed-off-by: Uros Bizjak <[email protected]>
> > ---
> > drivers/gpu/drm/i915/i915_pmu.c | 9 ++++-----
> > 1 file changed, 4 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
> > index d35973b41186..108b675088ba 100644
> > --- a/drivers/gpu/drm/i915/i915_pmu.c
> > +++ b/drivers/gpu/drm/i915/i915_pmu.c
> > @@ -696,12 +696,11 @@ static void i915_pmu_event_read(struct perf_event *event)
> > event->hw.state = PERF_HES_STOPPED;
> > return;
> > }
> > -again:
> > - prev = local64_read(&hwc->prev_count);
> > - new = __i915_pmu_event_read(event);
> >
> > - if (local64_cmpxchg(&hwc->prev_count, prev, new) != prev)
> > - goto again;
> > + prev = local64_read(&hwc->prev_count);
> > + do {
> > + new = __i915_pmu_event_read(event);
> > + } while (!local64_try_cmpxchg(&hwc->prev_count, &prev, new));
>
> You could save everyone a lot of time by actually documenting what these
> functions do. Assume you don't know what local64_try_cmpxchg() does, and
> see how many calls you have to go through to figure it out.

These functions are documented in Documentation/atomic_t.txt (under the
"RMW ops:" section), and the difference is explained in a separate
section, "CMPXCHG vs TRY_CMPXCHG", in the same file.

Uros.

> Because the next time I encounter this code or a patch like this, I'm
> probably going to have to do that again.
>
> To me, the old one was more readable. The optimization is meaningless to
> me if it's not quantified but reduces readability.
>
>
> BR,
> Jani.
>
>
> >
> > local64_add(new - prev, &event->count);
> > }
>
> --
> Jani Nikula, Intel Open Source Graphics Center
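
As a rough illustration of the CMPXCHG vs TRY_CMPXCHG distinction referred to in this message, a try_cmpxchg can be expressed in terms of a value-returning cmpxchg. The sketch below is a userspace approximation, not the kernel implementation: my_cmpxchg() and my_try_cmpxchg() are hypothetical names, and the GCC/Clang builtin __sync_val_compare_and_swap() is assumed as the underlying primitive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for a value-returning cmpxchg: returns what was in *ptr. */
static int64_t my_cmpxchg(int64_t *ptr, int64_t old, int64_t new)
{
	return __sync_val_compare_and_swap(ptr, old, new);
}

/*
 * try_cmpxchg semantics: return true on success; on failure, store the
 * value actually found into *oldp so the caller need not re-read it.
 */
static bool my_try_cmpxchg(int64_t *ptr, int64_t *oldp, int64_t new)
{
	int64_t old = *oldp;
	int64_t r = my_cmpxchg(ptr, old, new);

	if (r != old)
		*oldp = r;
	return r == old;
}

/* Exercise one successful and one failing exchange. */
static void selftest_try_cmpxchg(void)
{
	int64_t v = 1, old = 1;

	assert(my_try_cmpxchg(&v, &old, 2));	/* 1 matched, v becomes 2 */
	assert(v == 2 && old == 1);

	old = 7;				/* stale expectation */
	assert(!my_try_cmpxchg(&v, &old, 3));	/* fails, v stays 2 */
	assert(v == 2 && old == 2);		/* old refreshed to current value */
}
```

The failing case is the one that matters for the patch: after a failed exchange the caller already holds the current value in old and can retry directly.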

2023-07-04 08:55:45

by Jani Nikula

Subject: Re: [PATCH] drm/i915/pmu: Use local64_try_cmpxchg in i915_pmu_event_read

On Tue, 04 Jul 2023, Uros Bizjak <[email protected]> wrote:
> On Tue, Jul 4, 2023 at 9:28 AM Jani Nikula <[email protected]> wrote:
>> You could save everyone a lot of time by actually documenting what these
>> functions do. Assume you don't know what local64_try_cmpxchg() does, and
>> see how many calls you have to go through to figure it out.
>
> These functions are documented in Documentation/atomic_t.txt (under
> "RMW ops:" section), and the difference is explained in a separate
> section "CMPXCHG vs TRY_CMPXCHG" in the same file.

Thanks, but *sigh*.

No kernel-doc above the functions, not even a regular comment
referencing atomic_t.txt.

$ git grep local.*_try -- Documentation
[nothing]


BR,
Jani.


--

"But the plans were on display..."

"On display? I eventually had to go down to the cellar to find them."

"That's the display department."

"With a flashlight."

"Ah, well, the lights had probably gone."

"So had the stairs."

"But look, you found the notice, didn't you?"

"Yes," said Arthur, "yes I did. It was on display in the bottom of a
locked filing cabinet stuck in a disused lavatory with a sign on the
door saying 'Beware of the Leopard'."

- Douglas Adams, The Hitchhiker's Guide to the Galaxy

--
Jani Nikula, Intel Open Source Graphics Center

2023-07-04 09:33:49

by Uros Bizjak

Subject: Re: [PATCH] drm/i915/pmu: Use local64_try_cmpxchg in i915_pmu_event_read

On Tue, Jul 4, 2023 at 10:37 AM Jani Nikula <[email protected]> wrote:
>
> On Tue, 04 Jul 2023, Uros Bizjak <[email protected]> wrote:
> > On Tue, Jul 4, 2023 at 9:28 AM Jani Nikula <[email protected]> wrote:
> >> You could save everyone a lot of time by actually documenting what these
> >> functions do. Assume you don't know what local64_try_cmpxchg() does, and
> >> see how many calls you have to go through to figure it out.
> >
> > These functions are documented in Documentation/atomic_t.txt (under
> > "RMW ops:" section), and the difference is explained in a separate
> > section "CMPXCHG vs TRY_CMPXCHG" in the same file.
>
> Thanks, but *sigh*.
>
> No kernel-doc above the functions, not even a regular comment
> referencing atomic_t.txt.
>
> $ git grep local.*_try -- Documentation
> [nothing]

Unfortunately, this has always been the state of the local.* atomic
functions. There is an effort to improve the documentation of atomics;
perhaps it will also be extended to the local variants.

Uros.

2023-10-05 16:06:50

by Jani Nikula

Subject: Re: [PATCH] drm/i915/pmu: Use local64_try_cmpxchg in i915_pmu_event_read

On Mon, 03 Jul 2023, Uros Bizjak <[email protected]> wrote:
> Use local64_try_cmpxchg instead of local64_cmpxchg (*ptr, old, new) == old
> in i915_pmu_event_read. x86 CMPXCHG instruction returns success in ZF flag,
> so this change saves a compare after cmpxchg (and related move instruction
> in front of cmpxchg).
>
> Also, try_cmpxchg implicitly assigns old *ptr value to "old" when cmpxchg
> fails. There is no need to re-read the value in the loop.
>
> No functional change intended.
>
> Cc: Jani Nikula <[email protected]>
> Cc: Joonas Lahtinen <[email protected]>
> Cc: Rodrigo Vivi <[email protected]>
> Cc: Tvrtko Ursulin <[email protected]>
> Cc: David Airlie <[email protected]>
> Cc: Daniel Vetter <[email protected]>
> Signed-off-by: Uros Bizjak <[email protected]>
> ---
> drivers/gpu/drm/i915/i915_pmu.c | 9 ++++-----
> 1 file changed, 4 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
> index d35973b41186..108b675088ba 100644
> --- a/drivers/gpu/drm/i915/i915_pmu.c
> +++ b/drivers/gpu/drm/i915/i915_pmu.c
> @@ -696,12 +696,11 @@ static void i915_pmu_event_read(struct perf_event *event)
> event->hw.state = PERF_HES_STOPPED;
> return;
> }
> -again:
> - prev = local64_read(&hwc->prev_count);
> - new = __i915_pmu_event_read(event);
>
> - if (local64_cmpxchg(&hwc->prev_count, prev, new) != prev)
> - goto again;
> + prev = local64_read(&hwc->prev_count);
> + do {
> + new = __i915_pmu_event_read(event);
> + } while (!local64_try_cmpxchg(&hwc->prev_count, &prev, new));

Chased through the documentation again, and pushed to drm-intel-next.

Thanks for the patch.

BR,
Jani.

>
> local64_add(new - prev, &event->count);
> }

--
Jani Nikula, Intel