2020-09-11 17:11:06

by Rodrigo Siqueira Jordao

[permalink] [raw]
Subject: [PATCH v2 0/4] Enlarge tracepoints in the display component

Debug issues related to display can be a challenge due to the complexity
around this topic and different source of information might help in this
process. We already have support for tracepoints inside the display
component, i.e., we have the basic functionalities available and we just
need to expand it in order to make it more valuable for debugging. For
this reason, this patchset reworks part of the current tracepoint
options and add different sets of tracing inside amdgpu_dm, display
core, and DCN10. The first patch of this series just rework part of the
current tracepoints and the last set of patches introduces new
tracepoints.

This first patchset version is functional. Please, let me know what I
can improve in the current version but also let me know what kind of
tracepoint I can add for the next version.

Finally, I want to highlight that this work is based on a set of patches
originally made by Nicholas Kazlauskas.

Change in V2:
- I added another patch for capturing the clock state for different display
architecture.

Rodrigo Siqueira (4):
drm/amd/display: Rework registers tracepoint
drm/amd/display: Add tracepoint for amdgpu_dm
drm/amd/display: Add pipe_state tracepoint
drm/amd/display: Add tracepoint for capturing clocks state

.../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 17 +
.../amd/display/amdgpu_dm/amdgpu_dm_trace.h | 712 +++++++++++++++++-
.../dc/clk_mgr/dce112/dce112_clk_mgr.c | 5 +
.../display/dc/clk_mgr/dcn10/rv1_clk_mgr.c | 4 +
.../display/dc/clk_mgr/dcn20/dcn20_clk_mgr.c | 4 +
.../amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c | 4 +
.../display/dc/clk_mgr/dcn30/dcn30_clk_mgr.c | 4 +
drivers/gpu/drm/amd/display/dc/core/dc.c | 11 +
.../gpu/drm/amd/display/dc/dce/dce_clk_mgr.c | 5 +
.../amd/display/dc/dcn10/dcn10_hw_sequencer.c | 17 +-
10 files changed, 747 insertions(+), 36 deletions(-)

--
2.28.0


2020-09-16 09:15:43

by Daniel Vetter

[permalink] [raw]
Subject: Re: [PATCH v2 0/4] Enlarge tracepoints in the display component

On Fri, Sep 11, 2020 at 10:59:23AM -0400, Rodrigo Siqueira wrote:
> Debug issues related to display can be a challenge due to the complexity
> around this topic and different source of information might help in this
> process. We already have support for tracepoints inside the display
> component, i.e., we have the basic functionalities available and we just
> need to expand it in order to make it more valuable for debugging. For
> this reason, this patchset reworks part of the current tracepoint
> options and add different sets of tracing inside amdgpu_dm, display
> core, and DCN10. The first patch of this series just rework part of the
> current tracepoints and the last set of patches introduces new
> tracepoints.
>
> This first patchset version is functional. Please, let me know what I
> can improve in the current version but also let me know what kind of
> tracepoint I can add for the next version.
>
> Finally, I want to highlight that this work is based on a set of patches
> originally made by Nicholas Kazlauskas.
>
> Change in V2:
> - I added another patch for capturing the clock state for different display
> architecture.

Hm I'm not super sure tracepoints for state dumping are the right thing
here. We kinda have the atomic state dumping code with all the various
callbacks, and you can extend that pretty easily. Gives you full state
dump in debugfs, plus a few function to dump into dmesg.

Maybe what we need is a function to dump this also into printk tracepoint
(otoh with Sean Paul's tracepoint work we'd get that through the dmesg
stuff already), and then you could do it there?

Upside is that for customers they'd get a much more consistent way to
debug display issues across different drivers.

For low-level hw debug what we do is give the hw guys an mmio trace, and
they replay it on the fancy boxes :-) So for that I think this here is
again too high level, but maybe what you have is a bit different.
-Daniel

>
> Rodrigo Siqueira (4):
> drm/amd/display: Rework registers tracepoint
> drm/amd/display: Add tracepoint for amdgpu_dm
> drm/amd/display: Add pipe_state tracepoint
> drm/amd/display: Add tracepoint for capturing clocks state
>
> .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 17 +
> .../amd/display/amdgpu_dm/amdgpu_dm_trace.h | 712 +++++++++++++++++-
> .../dc/clk_mgr/dce112/dce112_clk_mgr.c | 5 +
> .../display/dc/clk_mgr/dcn10/rv1_clk_mgr.c | 4 +
> .../display/dc/clk_mgr/dcn20/dcn20_clk_mgr.c | 4 +
> .../amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c | 4 +
> .../display/dc/clk_mgr/dcn30/dcn30_clk_mgr.c | 4 +
> drivers/gpu/drm/amd/display/dc/core/dc.c | 11 +
> .../gpu/drm/amd/display/dc/dce/dce_clk_mgr.c | 5 +
> .../amd/display/dc/dcn10/dcn10_hw_sequencer.c | 17 +-
> 10 files changed, 747 insertions(+), 36 deletions(-)
>
> --
> 2.28.0
>

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

2020-09-16 20:49:16

by Kazlauskas, Nicholas

[permalink] [raw]
Subject: Re: [PATCH v2 0/4] Enlarge tracepoints in the display component

On 2020-09-16 5:12 a.m., Daniel Vetter wrote:
> On Fri, Sep 11, 2020 at 10:59:23AM -0400, Rodrigo Siqueira wrote:
>> Debug issues related to display can be a challenge due to the complexity
>> around this topic and different source of information might help in this
>> process. We already have support for tracepoints inside the display
>> component, i.e., we have the basic functionalities available and we just
>> need to expand it in order to make it more valuable for debugging. For
>> this reason, this patchset reworks part of the current tracepoint
>> options and add different sets of tracing inside amdgpu_dm, display
>> core, and DCN10. The first patch of this series just rework part of the
>> current tracepoints and the last set of patches introduces new
>> tracepoints.
>>
>> This first patchset version is functional. Please, let me know what I
>> can improve in the current version but also let me know what kind of
>> tracepoint I can add for the next version.
>>
>> Finally, I want to highlight that this work is based on a set of patches
>> originally made by Nicholas Kazlauskas.
>>
>> Change in V2:
>> - I added another patch for capturing the clock state for different display
>> architecture.
>
> Hm I'm not super sure tracepoints for state dumping are the right thing
> here. We kinda have the atomic state dumping code with all the various
> callbacks, and you can extend that pretty easily. Gives you full state
> dump in debugfs, plus a few function to dump into dmesg.
>
> Maybe what we need is a function to dump this also into printk tracepoint
> (otoh with Sean Paul's tracepoint work we'd get that through the dmesg
> stuff already), and then you could do it there?
>
> Upside is that for customers they'd get a much more consistent way to
> debug display issues across different drivers.
>
> For low-level hw debug what we do is give the hw guys an mmio trace, and
> they replay it on the fancy boxes :-) So for that I think this here is
> again too high level, but maybe what you have is a bit different.
> -Daniel

We have raw register traces, but what I find most useful is to be able
to see are the incoming DRM IOCTLs, objects and properties per commit.

Many of the bugs we see in display code is in the conversion from DRM ->
DM -> DC state. The current HW state is kind of useless in most cases,
but the sequence helps track down intermittent problems and understand
state transitions.

Tracepoints provide everything I really need to be able to track down
these problems without falling back to a full debugger. The existing DRM
prints (even at high logging levels) aren't enough to understand what's
going on in most cases in our driver so funneling those into tracepoints
to improve perf doesn't really help that much.

I think this kind of idea was rejected for DRM core last year with
Sean's patch series but if we can't get them into core then I'd like to
get them into our driver at least. These are a cleaned up version of
Sean's work + my work that I end up applying locally whenever I debug
something.

Regards,
Nicholas Kazlauskas

>
>>
>> Rodrigo Siqueira (4):
>> drm/amd/display: Rework registers tracepoint
>> drm/amd/display: Add tracepoint for amdgpu_dm
>> drm/amd/display: Add pipe_state tracepoint
>> drm/amd/display: Add tracepoint for capturing clocks state
>>
>> .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 17 +
>> .../amd/display/amdgpu_dm/amdgpu_dm_trace.h | 712 +++++++++++++++++-
>> .../dc/clk_mgr/dce112/dce112_clk_mgr.c | 5 +
>> .../display/dc/clk_mgr/dcn10/rv1_clk_mgr.c | 4 +
>> .../display/dc/clk_mgr/dcn20/dcn20_clk_mgr.c | 4 +
>> .../amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c | 4 +
>> .../display/dc/clk_mgr/dcn30/dcn30_clk_mgr.c | 4 +
>> drivers/gpu/drm/amd/display/dc/core/dc.c | 11 +
>> .../gpu/drm/amd/display/dc/dce/dce_clk_mgr.c | 5 +
>> .../amd/display/dc/dcn10/dcn10_hw_sequencer.c | 17 +-
>> 10 files changed, 747 insertions(+), 36 deletions(-)
>>
>> --
>> 2.28.0
>>
>

2020-09-17 11:42:02

by Daniel Vetter

[permalink] [raw]
Subject: Re: [PATCH v2 0/4] Enlarge tracepoints in the display component

On Wed, Sep 16, 2020 at 11:27:27AM -0400, Kazlauskas, Nicholas wrote:
> On 2020-09-16 5:12 a.m., Daniel Vetter wrote:
> > On Fri, Sep 11, 2020 at 10:59:23AM -0400, Rodrigo Siqueira wrote:
> > > Debug issues related to display can be a challenge due to the complexity
> > > around this topic and different source of information might help in this
> > > process. We already have support for tracepoints inside the display
> > > component, i.e., we have the basic functionalities available and we just
> > > need to expand it in order to make it more valuable for debugging. For
> > > this reason, this patchset reworks part of the current tracepoint
> > > options and add different sets of tracing inside amdgpu_dm, display
> > > core, and DCN10. The first patch of this series just rework part of the
> > > current tracepoints and the last set of patches introduces new
> > > tracepoints.
> > >
> > > This first patchset version is functional. Please, let me know what I
> > > can improve in the current version but also let me know what kind of
> > > tracepoint I can add for the next version.
> > >
> > > Finally, I want to highlight that this work is based on a set of patches
> > > originally made by Nicholas Kazlauskas.
> > >
> > > Change in V2:
> > > - I added another patch for capturing the clock state for different display
> > > architecture.
> >
> > Hm I'm not super sure tracepoints for state dumping are the right thing
> > here. We kinda have the atomic state dumping code with all the various
> > callbacks, and you can extend that pretty easily. Gives you full state
> > dump in debugfs, plus a few function to dump into dmesg.
> >
> > Maybe what we need is a function to dump this also into printk tracepoint
> > (otoh with Sean Paul's tracepoint work we'd get that through the dmesg
> > stuff already), and then you could do it there?
> >
> > Upside is that for customers they'd get a much more consistent way to
> > debug display issues across different drivers.
> >
> > For low-level hw debug what we do is give the hw guys an mmio trace, and
> > they replay it on the fancy boxes :-) So for that I think this here is
> > again too high level, but maybe what you have is a bit different.
> > -Daniel
>
> We have raw register traces, but what I find most useful is to be able to
> see are the incoming DRM IOCTLs, objects and properties per commit.
>
> Many of the bugs we see in display code is in the conversion from DRM -> DM
> -> DC state. The current HW state is kind of useless in most cases, but the
> sequence helps track down intermittent problems and understand state
> transitions.
>
> Tracepoints provide everything I really need to be able to track down these
> problems without falling back to a full debugger. The existing DRM prints
> (even at high logging levels) aren't enough to understand what's going on in
> most cases in our driver so funneling those into tracepoints to improve perf
> doesn't really help that much.
>
> I think this kind of idea was rejected for DRM core last year with Sean's
> patch series but if we can't get them into core then I'd like to get them
> into our driver at least. These are a cleaned up version of Sean's work + my
> work that I end up applying locally whenever I debug something.

Nah, Sean's series wasn't rejected. It's simply stuck waiting for review.
So if your goal is to get better dumping going on, I think combining this
with Sean's work (and getting that reviewed), plus then tapping into the
atomic state dumping code. Then you know what was requested, plus what your
atomic_check code computed should be the hw state, and you can compare
that with the register dumps you already grab.

Feels at least like a more complete and flexible solution than ad-hoc
tracepoints for debuggin in each driver. The idea behind Sean's work is
also that we'd have a blackbox recorder for any drm issues which distros
in the field could use. So driver doing their own debug output doesn't
sound super great.

I think Siqueira already chatted a bit with Sean.
-Daniel

>

> Regards,
> Nicholas Kazlauskas
>
> >
> > >
> > > Rodrigo Siqueira (4):
> > > drm/amd/display: Rework registers tracepoint
> > > drm/amd/display: Add tracepoint for amdgpu_dm
> > > drm/amd/display: Add pipe_state tracepoint
> > > drm/amd/display: Add tracepoint for capturing clocks state
> > >
> > > .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 17 +
> > > .../amd/display/amdgpu_dm/amdgpu_dm_trace.h | 712 +++++++++++++++++-
> > > .../dc/clk_mgr/dce112/dce112_clk_mgr.c | 5 +
> > > .../display/dc/clk_mgr/dcn10/rv1_clk_mgr.c | 4 +
> > > .../display/dc/clk_mgr/dcn20/dcn20_clk_mgr.c | 4 +
> > > .../amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c | 4 +
> > > .../display/dc/clk_mgr/dcn30/dcn30_clk_mgr.c | 4 +
> > > drivers/gpu/drm/amd/display/dc/core/dc.c | 11 +
> > > .../gpu/drm/amd/display/dc/dce/dce_clk_mgr.c | 5 +
> > > .../amd/display/dc/dcn10/dcn10_hw_sequencer.c | 17 +-
> > > 10 files changed, 747 insertions(+), 36 deletions(-)
> > >
> > > --
> > > 2.28.0
> > >
> >
>
> _______________________________________________
> dri-devel mailing list
> [email protected]
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch