2023-09-01 13:46:46

by Geert Uytterhoeven

Subject: Re: [RFC PATCH] drm/ssd130x: Allocate buffer in the CRTC's .atomic_check() callback

Hi Maxime,

On Fri, Sep 1, 2023 at 2:00 PM Maxime Ripard <[email protected]> wrote:
> On Fri, Sep 01, 2023 at 10:36:17AM +0200, Geert Uytterhoeven wrote:
> > On Fri, Sep 1, 2023 at 10:22 AM Maxime Ripard <[email protected]> wrote:
> > > On Wed, Aug 30, 2023 at 08:25:08AM +0200, Javier Martinez Canillas wrote:
> > > > The commit 45b58669e532 ("drm/ssd130x: Allocate buffer in the plane's
> > > > .atomic_check() callback") moved the allocation of the intermediate and
> > > > HW buffers from the encoder's .atomic_enable callback to primary plane's
> > > > .atomic_check callback.
> > > >
> > > > This was suggested by Maxime Ripard because drivers aren't allowed to fail
> > > > after drm_atomic_helper_swap_state() has been called, and the encoder's
> > > > .atomic_enable happens after the new atomic state has been swapped.
> > > >
> > > > But that change caused a performance regression on very slow platforms,
> > > > since now the allocation happens for every plane's atomic state commit.
> > > > For example, Geert Uytterhoeven reports that this is the case on a VexRiscv
> > > > softcore (a RISC-V CPU implementation on an FPGA).
> > >
> > > I'd like to have numbers on that. It's a bit surprising to me that,
> > > given how many objects we already allocate during a commit, two small
> > > additional allocations affect performances so dramatically, even on a
> > > slow platform.
> >
> > To be fair, I didn't benchmark that. Perhaps it's just too slow due to
> > all these other allocations (and whatever else happens).
> >
> > I just find it extremely silly to allocate a buffer over and over again,
> > while we know that buffer is needed for each and every display update.
>
> Maybe it's silly, but I guess it depends on what you want to optimize
> for. You won't know the size of that buffer before you're in
> atomic_check. So it's a different trade-off than you would like, but I
> wouldn't call it extremely silly.

The size of ssd130x_plane_state.data_array[] is fixed, and depends
on the actual display connected.
The size of ssd130x_plane_state.buffer[] is also fixed, and depends
on the plane's size (which is currently fixed to the display size).
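
For illustration, the two sizes described above can be sketched as follows. This is a standalone sketch, not the driver's actual code: the function names echo the ssd130x fields, but the helpers themselves are hypothetical, and it assumes the usual SSD130x layout (R1 intermediate buffer, hardware pages of 8 vertical pixels):

```c
#include <stddef.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Sketch of the two buffer sizes discussed above: the intermediate
 * buffer holds the plane's pixels in R1 (one bit per pixel,
 * byte-aligned rows), while data_array holds the hardware layout
 * (one byte per column for each 8-pixel-tall page).
 */
static size_t ssd130x_buffer_size(unsigned int width, unsigned int height)
{
	return DIV_ROUND_UP(width, 8) * height;	/* R1: bits packed per row */
}

static size_t ssd130x_data_array_size(unsigned int width, unsigned int height)
{
	unsigned int pages = DIV_ROUND_UP(height, 8);

	return width * pages;	/* one byte per column, per page */
}
```

Both values depend only on the display/plane geometry, which (as long as that geometry is fixed) is the basis for the argument that the allocation could be done once rather than per commit.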

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


2023-09-05 16:24:59

by Javier Martinez Canillas

Subject: Re: [RFC PATCH] drm/ssd130x: Allocate buffer in the CRTC's .atomic_check() callback

Geert Uytterhoeven <[email protected]> writes:

Hello Geert,

> Hi Maxime,
>
> On Fri, Sep 1, 2023 at 2:00 PM Maxime Ripard <[email protected]> wrote:
>> On Fri, Sep 01, 2023 at 10:36:17AM +0200, Geert Uytterhoeven wrote:
>> > On Fri, Sep 1, 2023 at 10:22 AM Maxime Ripard <[email protected]> wrote:
>> > > On Wed, Aug 30, 2023 at 08:25:08AM +0200, Javier Martinez Canillas wrote:
>> > > > The commit 45b58669e532 ("drm/ssd130x: Allocate buffer in the plane's
>> > > > .atomic_check() callback") moved the allocation of the intermediate and
>> > > > HW buffers from the encoder's .atomic_enable callback to primary plane's
>> > > > .atomic_check callback.
>> > > >
>> > > > This was suggested by Maxime Ripard because drivers aren't allowed to fail
>> > > > after drm_atomic_helper_swap_state() has been called, and the encoder's
>> > > > .atomic_enable happens after the new atomic state has been swapped.
>> > > >
>> > > > But that change caused a performance regression on very slow platforms,
>> > > > since now the allocation happens for every plane's atomic state commit.
>> > > > For example, Geert Uytterhoeven reports that this is the case on a VexRiscv
>> > > > softcore (a RISC-V CPU implementation on an FPGA).
>> > >
>> > > I'd like to have numbers on that. It's a bit surprising to me that,
>> > > given how many objects we already allocate during a commit, two small
>> > > additional allocations affect performances so dramatically, even on a
>> > > slow platform.
>> >
>> > To be fair, I didn't benchmark that. Perhaps it's just too slow due to
>> > all these other allocations (and whatever else happens).
>> >
>> > I just find it extremely silly to allocate a buffer over and over again,
>> > while we know that buffer is needed for each and every display update.
>>
>> Maybe it's silly, but I guess it depends on what you want to optimize
>> for. You won't know the size of that buffer before you're in
>> atomic_check. So it's a different trade-off than you would like, but I
>> wouldn't call it extremely silly.
>
> The size of ssd130x_plane_state.data_array[] is fixed, and depends
> on the actual display connected.

Agreed.

> The size of ssd130x_plane_state.buffer[] is also fixed, and depends
> on the plane's size (which is currently fixed to the display size).
>

Well, one could say that it also depends on the format chosen. That is, if
XRGB8888 is used then its size should be the fixed display size, but if R1
is used its size could be 0 (since the shadow plane will already store the
pixels in R1 format).
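
That point can be illustrated with a small sketch. This is a hypothetical helper, not the driver's code, using simplified stand-ins for the DRM fourcc codes: the intermediate buffer is only needed when a conversion to R1 is required.

```c
#include <stddef.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Simplified stand-ins for the DRM fourcc format codes. */
enum fmt { FMT_XRGB8888, FMT_R1 };

/*
 * Sketch: the intermediate buffer holds the result of converting
 * the framebuffer to R1, so when the framebuffer is already R1
 * (the shadow plane stores the pixels in that format) no
 * intermediate buffer is needed at all.
 */
static size_t ssd130x_buffer_size_for_format(enum fmt format,
					     unsigned int width,
					     unsigned int height)
{
	if (format == FMT_R1)
		return 0;	/* shadow plane already stores R1 pixels */

	return DIV_ROUND_UP(width, 8) * height;
}
```

So the buffer size is still fully determined at atomic_check time, but by the format in addition to the (fixed) plane geometry.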

> Gr{oetje,eeting}s,
>

--
Best regards,

Javier Martinez Canillas
Core Platforms
Red Hat

2023-09-06 19:19:53

by Maxime Ripard

Subject: Re: [RFC PATCH] drm/ssd130x: Allocate buffer in the CRTC's .atomic_check() callback

On Fri, Sep 01, 2023 at 02:08:11PM +0200, Geert Uytterhoeven wrote:
> Hi Maxime,
>
> On Fri, Sep 1, 2023 at 2:00 PM Maxime Ripard <[email protected]> wrote:
> > On Fri, Sep 01, 2023 at 10:36:17AM +0200, Geert Uytterhoeven wrote:
> > > On Fri, Sep 1, 2023 at 10:22 AM Maxime Ripard <[email protected]> wrote:
> > > > On Wed, Aug 30, 2023 at 08:25:08AM +0200, Javier Martinez Canillas wrote:
> > > > > The commit 45b58669e532 ("drm/ssd130x: Allocate buffer in the plane's
> > > > > .atomic_check() callback") moved the allocation of the intermediate and
> > > > > HW buffers from the encoder's .atomic_enable callback to primary plane's
> > > > > .atomic_check callback.
> > > > >
> > > > > This was suggested by Maxime Ripard because drivers aren't allowed to fail
> > > > > after drm_atomic_helper_swap_state() has been called, and the encoder's
> > > > > .atomic_enable happens after the new atomic state has been swapped.
> > > > >
> > > > > But that change caused a performance regression on very slow platforms,
> > > > > since now the allocation happens for every plane's atomic state commit.
> > > > > For example, Geert Uytterhoeven reports that this is the case on a VexRiscv
> > > > > softcore (a RISC-V CPU implementation on an FPGA).
> > > >
> > > > I'd like to have numbers on that. It's a bit surprising to me that,
> > > > given how many objects we already allocate during a commit, two small
> > > > additional allocations affect performances so dramatically, even on a
> > > > slow platform.
> > >
> > > To be fair, I didn't benchmark that. Perhaps it's just too slow due to
> > > all these other allocations (and whatever else happens).
> > >
> > > I just find it extremely silly to allocate a buffer over and over again,
> > > while we know that buffer is needed for each and every display update.
> >
> > Maybe it's silly, but I guess it depends on what you want to optimize
> > for. You won't know the size of that buffer before you're in
> > atomic_check. So it's a different trade-off than you would like, but I
> > wouldn't call it extremely silly.
>
> The size of ssd130x_plane_state.data_array[] is fixed, and depends
> on the actual display connected.

That one can be tied to the CRTC state if needed. It would only be
allocated on each modeset, so probably once for that kind of device.
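
As a sketch of that trade-off (a user-space simulation with malloc, not kernel code; the names are illustrative): a buffer tied to the CRTC state need only be (re)allocated when the required size changes, which for a fixed-resolution panel means roughly once.

```c
#include <stdlib.h>
#include <stddef.h>

/* Toy stand-in for a CRTC state carrying the HW buffer. */
struct crtc_state {
	unsigned char *data_array;
	size_t data_array_size;
};

/*
 * Sketch of "allocate on modeset only": reallocate the buffer
 * only when the required size changed, so plane updates that keep
 * the same mode reuse the existing allocation instead of paying
 * for a fresh one on every commit.
 */
static int crtc_check_alloc(struct crtc_state *state, size_t needed)
{
	if (state->data_array && state->data_array_size == needed)
		return 0;	/* reuse: no per-commit allocation */

	free(state->data_array);
	state->data_array = malloc(needed);
	if (!state->data_array)
		return -1;
	state->data_array_size = needed;
	return 0;
}
```

The failure path still lives in a check phase, so the "no failures after drm_atomic_helper_swap_state()" rule mentioned earlier in the thread is preserved.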

> The size of ssd130x_plane_state.buffer[] is also fixed, and depends
> on the plane's size (which is currently fixed to the display size).

Doesn't it depend on the format as well?

Maxime