2019-11-01 18:18:53

by Rob Clark

[permalink] [raw]
Subject: [PATCH 1/2] drm/atomic: fix self-refresh helpers crtc state dereference

From: Rob Clark <[email protected]>

drm_self_refresh_helper_update_avg_times() was incorrectly accessing the
new incoming state after drm_atomic_helper_commit_hw_done(). But this
state might have already been superceeded by an !nonblock atomic update
resulting in dereferencing an already free'd crtc_state.

Fixes: d4da4e33341c ("drm: Measure Self Refresh Entry/Exit times to avoid thrashing")
Cc: Sean Paul <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
---
TODO I *think* this will more or less do the right thing.. althought I'm
not 100% sure if, for example, we enter psr in a nonblock commit, and
then leave psr in a !nonblock commit that overtakes the completion of
the nonblock commit. Not sure if this sort of scenario can happen in
practice. But not crashing is better than crashing, so I guess we
should either take this patch or rever the self-refresh helpers until
Sean can figure out a better solution.

drivers/gpu/drm/drm_atomic_helper.c | 2 ++
drivers/gpu/drm/drm_self_refresh_helper.c | 11 ++++++-----
include/drm/drm_atomic.h | 8 ++++++++
3 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
index 3ef2ac52ce94..732bd0ce9241 100644
--- a/drivers/gpu/drm/drm_atomic_helper.c
+++ b/drivers/gpu/drm/drm_atomic_helper.c
@@ -2240,6 +2240,8 @@ void drm_atomic_helper_commit_hw_done(struct drm_atomic_state *old_state)
int i;

for_each_oldnew_crtc_in_state(old_state, crtc, old_crtc_state, new_crtc_state, i) {
+ old_state->crtcs[i].new_self_refresh_active = new_crtc_state->self_refresh_active;
+
commit = new_crtc_state->commit;
if (!commit)
continue;
diff --git a/drivers/gpu/drm/drm_self_refresh_helper.c b/drivers/gpu/drm/drm_self_refresh_helper.c
index 68f4765a5896..77b9079fa578 100644
--- a/drivers/gpu/drm/drm_self_refresh_helper.c
+++ b/drivers/gpu/drm/drm_self_refresh_helper.c
@@ -143,19 +143,20 @@ void drm_self_refresh_helper_update_avg_times(struct drm_atomic_state *state,
unsigned int commit_time_ms)
{
struct drm_crtc *crtc;
- struct drm_crtc_state *old_crtc_state, *new_crtc_state;
+ struct drm_crtc_state *old_crtc_state;
int i;

- for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state,
- new_crtc_state, i) {
+ for_each_old_crtc_in_state(state, crtc, old_crtc_state, i) {
+ bool new_self_refresh_active =
+ state->crtcs[i].new_self_refresh_active;
struct drm_self_refresh_data *sr_data = crtc->self_refresh_data;
struct ewma_psr_time *time;

if (old_crtc_state->self_refresh_active ==
- new_crtc_state->self_refresh_active)
+ new_self_refresh_active)
continue;

- if (new_crtc_state->self_refresh_active)
+ if (new_self_refresh_active)
time = &sr_data->entry_avg_ms;
else
time = &sr_data->exit_avg_ms;
diff --git a/include/drm/drm_atomic.h b/include/drm/drm_atomic.h
index 927e1205d7aa..86baf2b38bb3 100644
--- a/include/drm/drm_atomic.h
+++ b/include/drm/drm_atomic.h
@@ -155,6 +155,14 @@ struct __drm_crtcs_state {
struct drm_crtc *ptr;
struct drm_crtc_state *state, *old_state, *new_state;

+ /**
+ * @new_self_refresh_active:
+ *
+ * drm_atomic_helper_commit_hw_done() stashes new_crtc_state->self_refresh_active
+ * so that it can be accessed late in drm_self_refresh_helper_update_avg_times().
+ */
+ bool new_self_refresh_active;
+
/**
* @commit:
*
--
2.21.0


2019-11-01 18:19:30

by Rob Clark

[permalink] [raw]
Subject: [PATCH 2/2] drm/atomic: clear new_state pointers at hw_done

From: Rob Clark <[email protected]>

The new state should not be accessed after this point. Clear the
pointers to make that explicit.

This makes the error corrected in the previous patch more obvious.

Signed-off-by: Rob Clark <[email protected]>
---
drivers/gpu/drm/drm_atomic_helper.c | 29 +++++++++++++++++++++++++++++
1 file changed, 29 insertions(+)

diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
index 732bd0ce9241..176831df8163 100644
--- a/drivers/gpu/drm/drm_atomic_helper.c
+++ b/drivers/gpu/drm/drm_atomic_helper.c
@@ -2234,13 +2234,42 @@ EXPORT_SYMBOL(drm_atomic_helper_fake_vblank);
*/
void drm_atomic_helper_commit_hw_done(struct drm_atomic_state *old_state)
{
+ struct drm_connector *connector;
+ struct drm_connector_state *old_conn_state, *new_conn_state;
struct drm_crtc *crtc;
struct drm_crtc_state *old_crtc_state, *new_crtc_state;
+ struct drm_plane *plane;
+ struct drm_plane_state *old_plane_state, *new_plane_state;
struct drm_crtc_commit *commit;
+ struct drm_private_obj *obj;
+ struct drm_private_state *old_obj_state, *new_obj_state;
int i;

+ /*
+ * After this point, drivers should not access the permanent modeset
+ * state, so we also clear the new_state pointers to make this
+ * restriction explicit.
+ *
+ * For the CRTC state, we do this in the same loop where we signal
+ * hw_done, since we still need to new_crtc_state to fish out the
+ * commit.
+ */
+
+ for_each_oldnew_connector_in_state(old_state, connector, old_conn_state, new_conn_state, i) {
+ old_state->connectors[i].new_state = NULL;
+ }
+
+ for_each_oldnew_plane_in_state(old_state, plane, old_plane_state, new_plane_state, i) {
+ old_state->planes[i].new_state = NULL;
+ }
+
+ for_each_oldnew_private_obj_in_state(old_state, obj, old_obj_state, new_obj_state, i) {
+ old_state->private_objs[i].new_state = NULL;
+ }
+
for_each_oldnew_crtc_in_state(old_state, crtc, old_crtc_state, new_crtc_state, i) {
old_state->crtcs[i].new_self_refresh_active = new_crtc_state->self_refresh_active;
+ old_state->crtcs[i].new_state = NULL;

commit = new_crtc_state->commit;
if (!commit)
--
2.21.0

2019-11-01 18:37:27

by Maarten Lankhorst

[permalink] [raw]
Subject: Re: [PATCH 2/2] drm/atomic: clear new_state pointers at hw_done

Op 01-11-2019 om 19:07 schreef Rob Clark:
> From: Rob Clark <[email protected]>
>
> The new state should not be accessed after this point. Clear the
> pointers to make that explicit.
>
> This makes the error corrected in the previous patch more obvious.
>
> Signed-off-by: Rob Clark <[email protected]>

Would be nice if you could cc [email protected] next time, so I know our CI infrastructure is happy;

It wouldn't surprise me if it catches 1 or 2 abuses in i915.

Otherwise Reviewed-by: Maarten Lankhorst <[email protected]>

Perhaps you could mail this to version to [email protected] using git-send-email so we at least get i915 results?

> ---
> drivers/gpu/drm/drm_atomic_helper.c | 29 +++++++++++++++++++++++++++++
> 1 file changed, 29 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
> index 732bd0ce9241..176831df8163 100644
> --- a/drivers/gpu/drm/drm_atomic_helper.c
> +++ b/drivers/gpu/drm/drm_atomic_helper.c
> @@ -2234,13 +2234,42 @@ EXPORT_SYMBOL(drm_atomic_helper_fake_vblank);
> */
> void drm_atomic_helper_commit_hw_done(struct drm_atomic_state *old_state)
> {
> + struct drm_connector *connector;
> + struct drm_connector_state *old_conn_state, *new_conn_state;
> struct drm_crtc *crtc;
> struct drm_crtc_state *old_crtc_state, *new_crtc_state;
> + struct drm_plane *plane;
> + struct drm_plane_state *old_plane_state, *new_plane_state;
> struct drm_crtc_commit *commit;
> + struct drm_private_obj *obj;
> + struct drm_private_state *old_obj_state, *new_obj_state;
> int i;
>
> + /*
> + * After this point, drivers should not access the permanent modeset
> + * state, so we also clear the new_state pointers to make this
> + * restriction explicit.
> + *
> + * For the CRTC state, we do this in the same loop where we signal
> + * hw_done, since we still need to new_crtc_state to fish out the
> + * commit.
> + */
> +
> + for_each_oldnew_connector_in_state(old_state, connector, old_conn_state, new_conn_state, i) {
> + old_state->connectors[i].new_state = NULL;
> + }
> +
> + for_each_oldnew_plane_in_state(old_state, plane, old_plane_state, new_plane_state, i) {
> + old_state->planes[i].new_state = NULL;
> + }
> +
> + for_each_oldnew_private_obj_in_state(old_state, obj, old_obj_state, new_obj_state, i) {
> + old_state->private_objs[i].new_state = NULL;
> + }
> +
> for_each_oldnew_crtc_in_state(old_state, crtc, old_crtc_state, new_crtc_state, i) {
> old_state->crtcs[i].new_self_refresh_active = new_crtc_state->self_refresh_active;
> + old_state->crtcs[i].new_state = NULL;
>
> commit = new_crtc_state->commit;
> if (!commit)


2019-11-01 19:26:28

by Ville Syrjälä

[permalink] [raw]
Subject: Re: [PATCH 2/2] drm/atomic: clear new_state pointers at hw_done

On Fri, Nov 01, 2019 at 11:07:13AM -0700, Rob Clark wrote:
> From: Rob Clark <[email protected]>
>
> The new state should not be accessed after this point. Clear the
> pointers to make that explicit.
>
> This makes the error corrected in the previous patch more obvious.
>
> Signed-off-by: Rob Clark <[email protected]>
> ---
> drivers/gpu/drm/drm_atomic_helper.c | 29 +++++++++++++++++++++++++++++
> 1 file changed, 29 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
> index 732bd0ce9241..176831df8163 100644
> --- a/drivers/gpu/drm/drm_atomic_helper.c
> +++ b/drivers/gpu/drm/drm_atomic_helper.c
> @@ -2234,13 +2234,42 @@ EXPORT_SYMBOL(drm_atomic_helper_fake_vblank);
> */
> void drm_atomic_helper_commit_hw_done(struct drm_atomic_state *old_state)
> {
> + struct drm_connector *connector;
> + struct drm_connector_state *old_conn_state, *new_conn_state;
> struct drm_crtc *crtc;
> struct drm_crtc_state *old_crtc_state, *new_crtc_state;
> + struct drm_plane *plane;
> + struct drm_plane_state *old_plane_state, *new_plane_state;
> struct drm_crtc_commit *commit;
> + struct drm_private_obj *obj;
> + struct drm_private_state *old_obj_state, *new_obj_state;
> int i;
>
> + /*
> + * After this point, drivers should not access the permanent modeset
> + * state, so we also clear the new_state pointers to make this
> + * restriction explicit.
> + *
> + * For the CRTC state, we do this in the same loop where we signal
> + * hw_done, since we still need to new_crtc_state to fish out the
> + * commit.
> + */
> +
> + for_each_oldnew_connector_in_state(old_state, connector, old_conn_state, new_conn_state, i) {
> + old_state->connectors[i].new_state = NULL;
> + }
> +
> + for_each_oldnew_plane_in_state(old_state, plane, old_plane_state, new_plane_state, i) {
> + old_state->planes[i].new_state = NULL;
> + }
> +
> + for_each_oldnew_private_obj_in_state(old_state, obj, old_obj_state, new_obj_state, i) {
> + old_state->private_objs[i].new_state = NULL;
> + }
> +
> for_each_oldnew_crtc_in_state(old_state, crtc, old_crtc_state, new_crtc_state, i) {
> old_state->crtcs[i].new_self_refresh_active = new_crtc_state->self_refresh_active;
> + old_state->crtcs[i].new_state = NULL;

That's going to be a real PITA when doing programming after the fact from
a vblank worker. It's already a pain that the new_crtc_state->state is
getting NULLed somewhere.

>
> commit = new_crtc_state->commit;
> if (!commit)
> --
> 2.21.0
>
> _______________________________________________
> dri-devel mailing list
> [email protected]
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

--
Ville Syrj?l?
Intel

2019-11-01 19:49:16

by Rob Clark

[permalink] [raw]
Subject: Re: [PATCH 2/2] drm/atomic: clear new_state pointers at hw_done

On Fri, Nov 1, 2019 at 11:33 AM Maarten Lankhorst
<[email protected]> wrote:
>
> Op 01-11-2019 om 19:07 schreef Rob Clark:
> > From: Rob Clark <[email protected]>
> >
> > The new state should not be accessed after this point. Clear the
> > pointers to make that explicit.
> >
> > This makes the error corrected in the previous patch more obvious.
> >
> > Signed-off-by: Rob Clark <[email protected]>
>
> Would be nice if you could cc [email protected] next time, so I know our CI infrastructure is happy;
>
> It wouldn't surprise me if it catches 1 or 2 abuses in i915.
>
> Otherwise Reviewed-by: Maarten Lankhorst <[email protected]>
>
> Perhaps you could mail this to version to [email protected] using git-send-email so we at least get i915 results?

ok, I sent both patches to trybot, hopefully it tries them together,
as this patch without the self-refresh fix will definitely fall over

BR,
-R

2019-11-01 19:52:38

by Rob Clark

[permalink] [raw]
Subject: Re: [PATCH 2/2] drm/atomic: clear new_state pointers at hw_done

On Fri, Nov 1, 2019 at 12:25 PM Ville Syrjälä
<[email protected]> wrote:
>
> On Fri, Nov 01, 2019 at 11:07:13AM -0700, Rob Clark wrote:
> > From: Rob Clark <[email protected]>
> >
> > The new state should not be accessed after this point. Clear the
> > pointers to make that explicit.
> >
> > This makes the error corrected in the previous patch more obvious.
> >
> > Signed-off-by: Rob Clark <[email protected]>
> > ---
> > drivers/gpu/drm/drm_atomic_helper.c | 29 +++++++++++++++++++++++++++++
> > 1 file changed, 29 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
> > index 732bd0ce9241..176831df8163 100644
> > --- a/drivers/gpu/drm/drm_atomic_helper.c
> > +++ b/drivers/gpu/drm/drm_atomic_helper.c
> > @@ -2234,13 +2234,42 @@ EXPORT_SYMBOL(drm_atomic_helper_fake_vblank);
> > */
> > void drm_atomic_helper_commit_hw_done(struct drm_atomic_state *old_state)
> > {
> > + struct drm_connector *connector;
> > + struct drm_connector_state *old_conn_state, *new_conn_state;
> > struct drm_crtc *crtc;
> > struct drm_crtc_state *old_crtc_state, *new_crtc_state;
> > + struct drm_plane *plane;
> > + struct drm_plane_state *old_plane_state, *new_plane_state;
> > struct drm_crtc_commit *commit;
> > + struct drm_private_obj *obj;
> > + struct drm_private_state *old_obj_state, *new_obj_state;
> > int i;
> >
> > + /*
> > + * After this point, drivers should not access the permanent modeset
> > + * state, so we also clear the new_state pointers to make this
> > + * restriction explicit.
> > + *
> > + * For the CRTC state, we do this in the same loop where we signal
> > + * hw_done, since we still need to new_crtc_state to fish out the
> > + * commit.
> > + */
> > +
> > + for_each_oldnew_connector_in_state(old_state, connector, old_conn_state, new_conn_state, i) {
> > + old_state->connectors[i].new_state = NULL;
> > + }
> > +
> > + for_each_oldnew_plane_in_state(old_state, plane, old_plane_state, new_plane_state, i) {
> > + old_state->planes[i].new_state = NULL;
> > + }
> > +
> > + for_each_oldnew_private_obj_in_state(old_state, obj, old_obj_state, new_obj_state, i) {
> > + old_state->private_objs[i].new_state = NULL;
> > + }
> > +
> > for_each_oldnew_crtc_in_state(old_state, crtc, old_crtc_state, new_crtc_state, i) {
> > old_state->crtcs[i].new_self_refresh_active = new_crtc_state->self_refresh_active;
> > + old_state->crtcs[i].new_state = NULL;
>
> That's going to be a real PITA when doing programming after the fact from
> a vblank worker. It's already a pain that the new_crtc_state->state is
> getting NULLed somewhere.
>

I think you already have that problem, this just makes it explicit.

BR,
-R

2019-11-01 20:10:20

by Sean Paul

[permalink] [raw]
Subject: Re: [PATCH 1/2] drm/atomic: fix self-refresh helpers crtc state dereference

On Fri, Nov 1, 2019 at 2:09 PM Rob Clark <[email protected]> wrote:
>
> From: Rob Clark <[email protected]>
>
> drm_self_refresh_helper_update_avg_times() was incorrectly accessing the
> new incoming state after drm_atomic_helper_commit_hw_done(). But this
> state might have already been superceeded by an !nonblock atomic update
> resulting in dereferencing an already free'd crtc_state.
>
> Fixes: d4da4e33341c ("drm: Measure Self Refresh Entry/Exit times to avoid thrashing")
> Cc: Sean Paul <[email protected]>
> Signed-off-by: Rob Clark <[email protected]>
> ---
> TODO I *think* this will more or less do the right thing.. althought I'm
> not 100% sure if, for example, we enter psr in a nonblock commit, and
> then leave psr in a !nonblock commit that overtakes the completion of
> the nonblock commit. Not sure if this sort of scenario can happen in
> practice. But not crashing is better than crashing, so I guess we
> should either take this patch or rever the self-refresh helpers until
> Sean can figure out a better solution.
>
> drivers/gpu/drm/drm_atomic_helper.c | 2 ++
> drivers/gpu/drm/drm_self_refresh_helper.c | 11 ++++++-----
> include/drm/drm_atomic.h | 8 ++++++++
> 3 files changed, 16 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
> index 3ef2ac52ce94..732bd0ce9241 100644
> --- a/drivers/gpu/drm/drm_atomic_helper.c
> +++ b/drivers/gpu/drm/drm_atomic_helper.c
> @@ -2240,6 +2240,8 @@ void drm_atomic_helper_commit_hw_done(struct drm_atomic_state *old_state)
> int i;
>
> for_each_oldnew_crtc_in_state(old_state, crtc, old_crtc_state, new_crtc_state, i) {
> + old_state->crtcs[i].new_self_refresh_active = new_crtc_state->self_refresh_active;
> +
> commit = new_crtc_state->commit;
> if (!commit)
> continue;
> diff --git a/drivers/gpu/drm/drm_self_refresh_helper.c b/drivers/gpu/drm/drm_self_refresh_helper.c
> index 68f4765a5896..77b9079fa578 100644
> --- a/drivers/gpu/drm/drm_self_refresh_helper.c
> +++ b/drivers/gpu/drm/drm_self_refresh_helper.c
> @@ -143,19 +143,20 @@ void drm_self_refresh_helper_update_avg_times(struct drm_atomic_state *state,
> unsigned int commit_time_ms)
> {
> struct drm_crtc *crtc;
> - struct drm_crtc_state *old_crtc_state, *new_crtc_state;
> + struct drm_crtc_state *old_crtc_state;
> int i;
>
> - for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state,
> - new_crtc_state, i) {
> + for_each_old_crtc_in_state(state, crtc, old_crtc_state, i) {
> + bool new_self_refresh_active =
> + state->crtcs[i].new_self_refresh_active;
> struct drm_self_refresh_data *sr_data = crtc->self_refresh_data;
> struct ewma_psr_time *time;
>
> if (old_crtc_state->self_refresh_active ==
> - new_crtc_state->self_refresh_active)
> + new_self_refresh_active)
> continue;
>
> - if (new_crtc_state->self_refresh_active)
> + if (new_self_refresh_active)
> time = &sr_data->entry_avg_ms;
> else
> time = &sr_data->exit_avg_ms;
> diff --git a/include/drm/drm_atomic.h b/include/drm/drm_atomic.h
> index 927e1205d7aa..86baf2b38bb3 100644
> --- a/include/drm/drm_atomic.h
> +++ b/include/drm/drm_atomic.h
> @@ -155,6 +155,14 @@ struct __drm_crtcs_state {
> struct drm_crtc *ptr;
> struct drm_crtc_state *state, *old_state, *new_state;
>
> + /**
> + * @new_self_refresh_active:
> + *
> + * drm_atomic_helper_commit_hw_done() stashes new_crtc_state->self_refresh_active
> + * so that it can be accessed late in drm_self_refresh_helper_update_avg_times().
> + */
> + bool new_self_refresh_active;
> +

Instead of stashing this in crtc, we could generate a bitmask local to
commit_tail and pass that to calc_avg_times? That way we don't have to
worry about someone using this when they shouldn't

Sean

> /**
> * @commit:
> *
> --
> 2.21.0
>

2019-11-01 21:46:01

by Ville Syrjälä

[permalink] [raw]
Subject: Re: [PATCH 2/2] drm/atomic: clear new_state pointers at hw_done

On Fri, Nov 01, 2019 at 12:49:02PM -0700, Rob Clark wrote:
> On Fri, Nov 1, 2019 at 12:25 PM Ville Syrj?l?
> <[email protected]> wrote:
> >
> > On Fri, Nov 01, 2019 at 11:07:13AM -0700, Rob Clark wrote:
> > > From: Rob Clark <[email protected]>
> > >
> > > The new state should not be accessed after this point. Clear the
> > > pointers to make that explicit.
> > >
> > > This makes the error corrected in the previous patch more obvious.
> > >
> > > Signed-off-by: Rob Clark <[email protected]>
> > > ---
> > > drivers/gpu/drm/drm_atomic_helper.c | 29 +++++++++++++++++++++++++++++
> > > 1 file changed, 29 insertions(+)
> > >
> > > diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
> > > index 732bd0ce9241..176831df8163 100644
> > > --- a/drivers/gpu/drm/drm_atomic_helper.c
> > > +++ b/drivers/gpu/drm/drm_atomic_helper.c
> > > @@ -2234,13 +2234,42 @@ EXPORT_SYMBOL(drm_atomic_helper_fake_vblank);
> > > */
> > > void drm_atomic_helper_commit_hw_done(struct drm_atomic_state *old_state)
> > > {
> > > + struct drm_connector *connector;
> > > + struct drm_connector_state *old_conn_state, *new_conn_state;
> > > struct drm_crtc *crtc;
> > > struct drm_crtc_state *old_crtc_state, *new_crtc_state;
> > > + struct drm_plane *plane;
> > > + struct drm_plane_state *old_plane_state, *new_plane_state;
> > > struct drm_crtc_commit *commit;
> > > + struct drm_private_obj *obj;
> > > + struct drm_private_state *old_obj_state, *new_obj_state;
> > > int i;
> > >
> > > + /*
> > > + * After this point, drivers should not access the permanent modeset
> > > + * state, so we also clear the new_state pointers to make this
> > > + * restriction explicit.
> > > + *
> > > + * For the CRTC state, we do this in the same loop where we signal
> > > + * hw_done, since we still need to new_crtc_state to fish out the
> > > + * commit.
> > > + */
> > > +
> > > + for_each_oldnew_connector_in_state(old_state, connector, old_conn_state, new_conn_state, i) {
> > > + old_state->connectors[i].new_state = NULL;
> > > + }
> > > +
> > > + for_each_oldnew_plane_in_state(old_state, plane, old_plane_state, new_plane_state, i) {
> > > + old_state->planes[i].new_state = NULL;
> > > + }
> > > +
> > > + for_each_oldnew_private_obj_in_state(old_state, obj, old_obj_state, new_obj_state, i) {
> > > + old_state->private_objs[i].new_state = NULL;
> > > + }
> > > +
> > > for_each_oldnew_crtc_in_state(old_state, crtc, old_crtc_state, new_crtc_state, i) {
> > > old_state->crtcs[i].new_self_refresh_active = new_crtc_state->self_refresh_active;
> > > + old_state->crtcs[i].new_state = NULL;
> >
> > That's going to be a real PITA when doing programming after the fact from
> > a vblank worker. It's already a pain that the new_crtc_state->state is
> > getting NULLed somewhere.
> >
>
> I think you already have that problem, this just makes it explicit.

I don't yet. Except on a branch where I have my vblank workers.
And I think the only problem is having the helpers/core clobber
the pointers when it should not. I don't see why it can't just
leave them be and let me use them.

--
Ville Syrj?l?
Intel

2019-11-01 22:17:30

by Rob Clark

[permalink] [raw]
Subject: Re: [PATCH 2/2] drm/atomic: clear new_state pointers at hw_done

On Fri, Nov 1, 2019 at 2:44 PM Ville Syrjälä
<[email protected]> wrote:
>
> On Fri, Nov 01, 2019 at 12:49:02PM -0700, Rob Clark wrote:
> > On Fri, Nov 1, 2019 at 12:25 PM Ville Syrjälä
> > <[email protected]> wrote:
> > >
> > > On Fri, Nov 01, 2019 at 11:07:13AM -0700, Rob Clark wrote:
> > > > From: Rob Clark <[email protected]>
> > > >
> > > > The new state should not be accessed after this point. Clear the
> > > > pointers to make that explicit.
> > > >
> > > > This makes the error corrected in the previous patch more obvious.
> > > >
> > > > Signed-off-by: Rob Clark <[email protected]>
> > > > ---
> > > > drivers/gpu/drm/drm_atomic_helper.c | 29 +++++++++++++++++++++++++++++
> > > > 1 file changed, 29 insertions(+)
> > > >
> > > > diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
> > > > index 732bd0ce9241..176831df8163 100644
> > > > --- a/drivers/gpu/drm/drm_atomic_helper.c
> > > > +++ b/drivers/gpu/drm/drm_atomic_helper.c
> > > > @@ -2234,13 +2234,42 @@ EXPORT_SYMBOL(drm_atomic_helper_fake_vblank);
> > > > */
> > > > void drm_atomic_helper_commit_hw_done(struct drm_atomic_state *old_state)
> > > > {
> > > > + struct drm_connector *connector;
> > > > + struct drm_connector_state *old_conn_state, *new_conn_state;
> > > > struct drm_crtc *crtc;
> > > > struct drm_crtc_state *old_crtc_state, *new_crtc_state;
> > > > + struct drm_plane *plane;
> > > > + struct drm_plane_state *old_plane_state, *new_plane_state;
> > > > struct drm_crtc_commit *commit;
> > > > + struct drm_private_obj *obj;
> > > > + struct drm_private_state *old_obj_state, *new_obj_state;
> > > > int i;
> > > >
> > > > + /*
> > > > + * After this point, drivers should not access the permanent modeset
> > > > + * state, so we also clear the new_state pointers to make this
> > > > + * restriction explicit.
> > > > + *
> > > > + * For the CRTC state, we do this in the same loop where we signal
> > > > + * hw_done, since we still need to new_crtc_state to fish out the
> > > > + * commit.
> > > > + */
> > > > +
> > > > + for_each_oldnew_connector_in_state(old_state, connector, old_conn_state, new_conn_state, i) {
> > > > + old_state->connectors[i].new_state = NULL;
> > > > + }
> > > > +
> > > > + for_each_oldnew_plane_in_state(old_state, plane, old_plane_state, new_plane_state, i) {
> > > > + old_state->planes[i].new_state = NULL;
> > > > + }
> > > > +
> > > > + for_each_oldnew_private_obj_in_state(old_state, obj, old_obj_state, new_obj_state, i) {
> > > > + old_state->private_objs[i].new_state = NULL;
> > > > + }
> > > > +
> > > > for_each_oldnew_crtc_in_state(old_state, crtc, old_crtc_state, new_crtc_state, i) {
> > > > old_state->crtcs[i].new_self_refresh_active = new_crtc_state->self_refresh_active;
> > > > + old_state->crtcs[i].new_state = NULL;
> > >
> > > That's going to be a real PITA when doing programming after the fact from
> > > a vblank worker. It's already a pain that the new_crtc_state->state is
> > > getting NULLed somewhere.
> > >
> >
> > I think you already have that problem, this just makes it explicit.
>
> I don't yet. Except on a branch where I have my vblank workers.
> And I think the only problem is having the helpers/core clobber
> the pointers when it should not. I don't see why it can't just
> leave them be and let me use them.
>

I guess it comes down to what assumptions you can make in driver
backend. But if you can, for example, move planes between crtcs, I
think you can't make assumptions about the order in which things
complete even if you don't have commits overtaking each other on a
single crtc..

BR,
-R

2019-11-04 09:31:31

by Maarten Lankhorst

[permalink] [raw]
Subject: Re: [PATCH 1/2] drm/atomic: fix self-refresh helpers crtc state dereference

Op 01-11-2019 om 21:06 schreef Sean Paul:
> On Fri, Nov 1, 2019 at 2:09 PM Rob Clark <[email protected]> wrote:
>> From: Rob Clark <[email protected]>
>>
>> drm_self_refresh_helper_update_avg_times() was incorrectly accessing the
>> new incoming state after drm_atomic_helper_commit_hw_done(). But this
>> state might have already been superceeded by an !nonblock atomic update
>> resulting in dereferencing an already free'd crtc_state.
>>
>> Fixes: d4da4e33341c ("drm: Measure Self Refresh Entry/Exit times to avoid thrashing")
>> Cc: Sean Paul <[email protected]>
>> Signed-off-by: Rob Clark <[email protected]>
>> ---
>> TODO I *think* this will more or less do the right thing.. althought I'm
>> not 100% sure if, for example, we enter psr in a nonblock commit, and
>> then leave psr in a !nonblock commit that overtakes the completion of
>> the nonblock commit. Not sure if this sort of scenario can happen in
>> practice. But not crashing is better than crashing, so I guess we
>> should either take this patch or rever the self-refresh helpers until
>> Sean can figure out a better solution.
>>
>> drivers/gpu/drm/drm_atomic_helper.c | 2 ++
>> drivers/gpu/drm/drm_self_refresh_helper.c | 11 ++++++-----
>> include/drm/drm_atomic.h | 8 ++++++++
>> 3 files changed, 16 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
>> index 3ef2ac52ce94..732bd0ce9241 100644
>> --- a/drivers/gpu/drm/drm_atomic_helper.c
>> +++ b/drivers/gpu/drm/drm_atomic_helper.c
>> @@ -2240,6 +2240,8 @@ void drm_atomic_helper_commit_hw_done(struct drm_atomic_state *old_state)
>> int i;
>>
>> for_each_oldnew_crtc_in_state(old_state, crtc, old_crtc_state, new_crtc_state, i) {
>> + old_state->crtcs[i].new_self_refresh_active = new_crtc_state->self_refresh_active;
>> +
>> commit = new_crtc_state->commit;
>> if (!commit)
>> continue;
>> diff --git a/drivers/gpu/drm/drm_self_refresh_helper.c b/drivers/gpu/drm/drm_self_refresh_helper.c
>> index 68f4765a5896..77b9079fa578 100644
>> --- a/drivers/gpu/drm/drm_self_refresh_helper.c
>> +++ b/drivers/gpu/drm/drm_self_refresh_helper.c
>> @@ -143,19 +143,20 @@ void drm_self_refresh_helper_update_avg_times(struct drm_atomic_state *state,
>> unsigned int commit_time_ms)
>> {
>> struct drm_crtc *crtc;
>> - struct drm_crtc_state *old_crtc_state, *new_crtc_state;
>> + struct drm_crtc_state *old_crtc_state;
>> int i;
>>
>> - for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state,
>> - new_crtc_state, i) {
>> + for_each_old_crtc_in_state(state, crtc, old_crtc_state, i) {
>> + bool new_self_refresh_active =
>> + state->crtcs[i].new_self_refresh_active;
>> struct drm_self_refresh_data *sr_data = crtc->self_refresh_data;
>> struct ewma_psr_time *time;
>>
>> if (old_crtc_state->self_refresh_active ==
>> - new_crtc_state->self_refresh_active)
>> + new_self_refresh_active)
>> continue;
>>
>> - if (new_crtc_state->self_refresh_active)
>> + if (new_self_refresh_active)
>> time = &sr_data->entry_avg_ms;
>> else
>> time = &sr_data->exit_avg_ms;
>> diff --git a/include/drm/drm_atomic.h b/include/drm/drm_atomic.h
>> index 927e1205d7aa..86baf2b38bb3 100644
>> --- a/include/drm/drm_atomic.h
>> +++ b/include/drm/drm_atomic.h
>> @@ -155,6 +155,14 @@ struct __drm_crtcs_state {
>> struct drm_crtc *ptr;
>> struct drm_crtc_state *state, *old_state, *new_state;
>>
>> + /**
>> + * @new_self_refresh_active:
>> + *
>> + * drm_atomic_helper_commit_hw_done() stashes new_crtc_state->self_refresh_active
>> + * so that it can be accessed late in drm_self_refresh_helper_update_avg_times().
>> + */
>> + bool new_self_refresh_active;
>> +
> Instead of stashing this in crtc, we could generate a bitmask local to
> commit_tail and pass that to calc_avg_times? That way we don't have to
> worry about someone using this when they shouldn't

Yeah would make sense to have a bitmask, instead of making the property special. :)

Current solution seems a bit ugly.


2019-11-04 18:44:38

by Ville Syrjälä

[permalink] [raw]
Subject: Re: [PATCH 2/2] drm/atomic: clear new_state pointers at hw_done

On Fri, Nov 01, 2019 at 03:14:09PM -0700, Rob Clark wrote:
> On Fri, Nov 1, 2019 at 2:44 PM Ville Syrj?l?
> <[email protected]> wrote:
> >
> > On Fri, Nov 01, 2019 at 12:49:02PM -0700, Rob Clark wrote:
> > > On Fri, Nov 1, 2019 at 12:25 PM Ville Syrj?l?
> > > <[email protected]> wrote:
> > > >
> > > > On Fri, Nov 01, 2019 at 11:07:13AM -0700, Rob Clark wrote:
> > > > > From: Rob Clark <[email protected]>
> > > > >
> > > > > The new state should not be accessed after this point. Clear the
> > > > > pointers to make that explicit.
> > > > >
> > > > > This makes the error corrected in the previous patch more obvious.
> > > > >
> > > > > Signed-off-by: Rob Clark <[email protected]>
> > > > > ---
> > > > > drivers/gpu/drm/drm_atomic_helper.c | 29 +++++++++++++++++++++++++++++
> > > > > 1 file changed, 29 insertions(+)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
> > > > > index 732bd0ce9241..176831df8163 100644
> > > > > --- a/drivers/gpu/drm/drm_atomic_helper.c
> > > > > +++ b/drivers/gpu/drm/drm_atomic_helper.c
> > > > > @@ -2234,13 +2234,42 @@ EXPORT_SYMBOL(drm_atomic_helper_fake_vblank);
> > > > > */
> > > > > void drm_atomic_helper_commit_hw_done(struct drm_atomic_state *old_state)
> > > > > {
> > > > > + struct drm_connector *connector;
> > > > > + struct drm_connector_state *old_conn_state, *new_conn_state;
> > > > > struct drm_crtc *crtc;
> > > > > struct drm_crtc_state *old_crtc_state, *new_crtc_state;
> > > > > + struct drm_plane *plane;
> > > > > + struct drm_plane_state *old_plane_state, *new_plane_state;
> > > > > struct drm_crtc_commit *commit;
> > > > > + struct drm_private_obj *obj;
> > > > > + struct drm_private_state *old_obj_state, *new_obj_state;
> > > > > int i;
> > > > >
> > > > > + /*
> > > > > + * After this point, drivers should not access the permanent modeset
> > > > > + * state, so we also clear the new_state pointers to make this
> > > > > + * restriction explicit.
> > > > > + *
> > > > > + * For the CRTC state, we do this in the same loop where we signal
> > > > > + * hw_done, since we still need to new_crtc_state to fish out the
> > > > > + * commit.
> > > > > + */
> > > > > +
> > > > > + for_each_oldnew_connector_in_state(old_state, connector, old_conn_state, new_conn_state, i) {
> > > > > + old_state->connectors[i].new_state = NULL;
> > > > > + }
> > > > > +
> > > > > + for_each_oldnew_plane_in_state(old_state, plane, old_plane_state, new_plane_state, i) {
> > > > > + old_state->planes[i].new_state = NULL;
> > > > > + }
> > > > > +
> > > > > + for_each_oldnew_private_obj_in_state(old_state, obj, old_obj_state, new_obj_state, i) {
> > > > > + old_state->private_objs[i].new_state = NULL;
> > > > > + }
> > > > > +
> > > > > for_each_oldnew_crtc_in_state(old_state, crtc, old_crtc_state, new_crtc_state, i) {
> > > > > old_state->crtcs[i].new_self_refresh_active = new_crtc_state->self_refresh_active;
> > > > > + old_state->crtcs[i].new_state = NULL;
> > > >
> > > > That's going to be a real PITA when doing programming after the fact from
> > > > a vblank worker. It's already a pain that the new_crtc_state->state is
> > > > getting NULLed somewhere.
> > > >
> > >
> > > I think you already have that problem, this just makes it explicit.
> >
> > I don't yet. Except on a branch where I have my vblank workers.
> > And I think the only problem is having the helpers/core clobber
> > the pointers when it should not. I don't see why it can't just
> > leave them be and let me use them.
> >
>
> I guess it comes down to what assumptions you can make in driver
> backend. But if you can, for example, move planes between crtcs, I
> think you can't make assumptions about the order in which things
> complete even if you don't have commits overtaking each other on a
> single crtc..

IMO this whole notion of accessing new_crtc_state & co. being unsafe
for some reason is wrong. I think as long as I have the drm_atomic_state
I should be able to look at the new/old states within.

--
Ville Syrj?l?
Intel

2019-11-04 19:17:15

by Rob Clark

[permalink] [raw]
Subject: Re: [PATCH 2/2] drm/atomic: clear new_state pointers at hw_done

On Mon, Nov 4, 2019 at 10:42 AM Ville Syrjälä
<[email protected]> wrote:
>
> On Fri, Nov 01, 2019 at 03:14:09PM -0700, Rob Clark wrote:
> > On Fri, Nov 1, 2019 at 2:44 PM Ville Syrjälä
> > <[email protected]> wrote:
> > >
> > > On Fri, Nov 01, 2019 at 12:49:02PM -0700, Rob Clark wrote:
> > > > On Fri, Nov 1, 2019 at 12:25 PM Ville Syrjälä
> > > > <[email protected]> wrote:
> > > > >
> > > > > On Fri, Nov 01, 2019 at 11:07:13AM -0700, Rob Clark wrote:
> > > > > > From: Rob Clark <[email protected]>
> > > > > >
> > > > > > The new state should not be accessed after this point. Clear the
> > > > > > pointers to make that explicit.
> > > > > >
> > > > > > This makes the error corrected in the previous patch more obvious.
> > > > > >
> > > > > > Signed-off-by: Rob Clark <[email protected]>
> > > > > > ---
> > > > > > drivers/gpu/drm/drm_atomic_helper.c | 29 +++++++++++++++++++++++++++++
> > > > > > 1 file changed, 29 insertions(+)
> > > > > >
> > > > > > diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
> > > > > > index 732bd0ce9241..176831df8163 100644
> > > > > > --- a/drivers/gpu/drm/drm_atomic_helper.c
> > > > > > +++ b/drivers/gpu/drm/drm_atomic_helper.c
> > > > > > @@ -2234,13 +2234,42 @@ EXPORT_SYMBOL(drm_atomic_helper_fake_vblank);
> > > > > > */
> > > > > > void drm_atomic_helper_commit_hw_done(struct drm_atomic_state *old_state)
> > > > > > {
> > > > > > + struct drm_connector *connector;
> > > > > > + struct drm_connector_state *old_conn_state, *new_conn_state;
> > > > > > struct drm_crtc *crtc;
> > > > > > struct drm_crtc_state *old_crtc_state, *new_crtc_state;
> > > > > > + struct drm_plane *plane;
> > > > > > + struct drm_plane_state *old_plane_state, *new_plane_state;
> > > > > > struct drm_crtc_commit *commit;
> > > > > > + struct drm_private_obj *obj;
> > > > > > + struct drm_private_state *old_obj_state, *new_obj_state;
> > > > > > int i;
> > > > > >
> > > > > > + /*
> > > > > > + * After this point, drivers should not access the permanent modeset
> > > > > > + * state, so we also clear the new_state pointers to make this
> > > > > > + * restriction explicit.
> > > > > > + *
> > > > > > + * For the CRTC state, we do this in the same loop where we signal
> > > > > > + * hw_done, since we still need to new_crtc_state to fish out the
> > > > > > + * commit.
> > > > > > + */
> > > > > > +
> > > > > > + for_each_oldnew_connector_in_state(old_state, connector, old_conn_state, new_conn_state, i) {
> > > > > > + old_state->connectors[i].new_state = NULL;
> > > > > > + }
> > > > > > +
> > > > > > + for_each_oldnew_plane_in_state(old_state, plane, old_plane_state, new_plane_state, i) {
> > > > > > + old_state->planes[i].new_state = NULL;
> > > > > > + }
> > > > > > +
> > > > > > + for_each_oldnew_private_obj_in_state(old_state, obj, old_obj_state, new_obj_state, i) {
> > > > > > + old_state->private_objs[i].new_state = NULL;
> > > > > > + }
> > > > > > +
> > > > > > for_each_oldnew_crtc_in_state(old_state, crtc, old_crtc_state, new_crtc_state, i) {
> > > > > > old_state->crtcs[i].new_self_refresh_active = new_crtc_state->self_refresh_active;
> > > > > > + old_state->crtcs[i].new_state = NULL;
> > > > >
> > > > > That's going to be a real PITA when doing programming after the fact from
> > > > > a vblank worker. It's already a pain that the new_crtc_state->state is
> > > > > getting NULLed somewhere.
> > > > >
> > > >
> > > > I think you already have that problem, this just makes it explicit.
> > >
> > > I don't yet. Except on a branch where I have my vblank workers.
> > > And I think the only problem is having the helpers/core clobber
> > > the pointers when it should not. I don't see why it can't just
> > > leave them be and let me use them.
> > >
> >
> > I guess it comes down to what assumptions you can make in driver
> > backend. But if you can, for example, move planes between crtcs, I
> > think you can't make assumptions about the order in which things
> > complete even if you don't have commits overtaking each other on a
> > single crtc..
>
> IMO this whole notion of accessing new_crtc_state & co. being unsafe
> for some reason is wrong. I think as long as I have the drm_atomic_state
> I should be able to look at the new/old states within.
>

accessing new state only works if you can guarantee the order in which
commits complete, which I don't think you can do in the general sense.
Like I said, it is perhaps an assumption that certain drivers can
make, although I'd probably advise against that on the grounds that it
is fragile.

BR,
-R

2019-11-04 20:53:31

by Ville Syrjälä

[permalink] [raw]
Subject: Re: [PATCH 2/2] drm/atomic: clear new_state pointers at hw_done

On Mon, Nov 04, 2019 at 11:13:59AM -0800, Rob Clark wrote:
> On Mon, Nov 4, 2019 at 10:42 AM Ville Syrj?l?
> <[email protected]> wrote:
> >
> > On Fri, Nov 01, 2019 at 03:14:09PM -0700, Rob Clark wrote:
> > > On Fri, Nov 1, 2019 at 2:44 PM Ville Syrj?l?
> > > <[email protected]> wrote:
> > > >
> > > > On Fri, Nov 01, 2019 at 12:49:02PM -0700, Rob Clark wrote:
> > > > > On Fri, Nov 1, 2019 at 12:25 PM Ville Syrj?l?
> > > > > <[email protected]> wrote:
> > > > > >
> > > > > > On Fri, Nov 01, 2019 at 11:07:13AM -0700, Rob Clark wrote:
> > > > > > > From: Rob Clark <[email protected]>
> > > > > > >
> > > > > > > The new state should not be accessed after this point. Clear the
> > > > > > > pointers to make that explicit.
> > > > > > >
> > > > > > > This makes the error corrected in the previous patch more obvious.
> > > > > > >
> > > > > > > Signed-off-by: Rob Clark <[email protected]>
> > > > > > > ---
> > > > > > > drivers/gpu/drm/drm_atomic_helper.c | 29 +++++++++++++++++++++++++++++
> > > > > > > 1 file changed, 29 insertions(+)
> > > > > > >
> > > > > > > diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
> > > > > > > index 732bd0ce9241..176831df8163 100644
> > > > > > > --- a/drivers/gpu/drm/drm_atomic_helper.c
> > > > > > > +++ b/drivers/gpu/drm/drm_atomic_helper.c
> > > > > > > @@ -2234,13 +2234,42 @@ EXPORT_SYMBOL(drm_atomic_helper_fake_vblank);
> > > > > > > */
> > > > > > > void drm_atomic_helper_commit_hw_done(struct drm_atomic_state *old_state)
> > > > > > > {
> > > > > > > + struct drm_connector *connector;
> > > > > > > + struct drm_connector_state *old_conn_state, *new_conn_state;
> > > > > > > struct drm_crtc *crtc;
> > > > > > > struct drm_crtc_state *old_crtc_state, *new_crtc_state;
> > > > > > > + struct drm_plane *plane;
> > > > > > > + struct drm_plane_state *old_plane_state, *new_plane_state;
> > > > > > > struct drm_crtc_commit *commit;
> > > > > > > + struct drm_private_obj *obj;
> > > > > > > + struct drm_private_state *old_obj_state, *new_obj_state;
> > > > > > > int i;
> > > > > > >
> > > > > > > + /*
> > > > > > > + * After this point, drivers should not access the permanent modeset
> > > > > > > + * state, so we also clear the new_state pointers to make this
> > > > > > > + * restriction explicit.
> > > > > > > + *
> > > > > > > + * For the CRTC state, we do this in the same loop where we signal
> > > > > > > + * hw_done, since we still need to new_crtc_state to fish out the
> > > > > > > + * commit.
> > > > > > > + */
> > > > > > > +
> > > > > > > + for_each_oldnew_connector_in_state(old_state, connector, old_conn_state, new_conn_state, i) {
> > > > > > > + old_state->connectors[i].new_state = NULL;
> > > > > > > + }
> > > > > > > +
> > > > > > > + for_each_oldnew_plane_in_state(old_state, plane, old_plane_state, new_plane_state, i) {
> > > > > > > + old_state->planes[i].new_state = NULL;
> > > > > > > + }
> > > > > > > +
> > > > > > > + for_each_oldnew_private_obj_in_state(old_state, obj, old_obj_state, new_obj_state, i) {
> > > > > > > + old_state->private_objs[i].new_state = NULL;
> > > > > > > + }
> > > > > > > +
> > > > > > > for_each_oldnew_crtc_in_state(old_state, crtc, old_crtc_state, new_crtc_state, i) {
> > > > > > > old_state->crtcs[i].new_self_refresh_active = new_crtc_state->self_refresh_active;
> > > > > > > + old_state->crtcs[i].new_state = NULL;
> > > > > >
> > > > > > That's going to be a real PITA when doing programming after the fact from
> > > > > > a vblank worker. It's already a pain that the new_crtc_state->state is
> > > > > > getting NULLed somewhere.
> > > > > >
> > > > >
> > > > > I think you already have that problem, this just makes it explicit.
> > > >
> > > > I don't yet. Except on a branch where I have my vblank workers.
> > > > And I think the only problem is having the helpers/core clobber
> > > > the pointers when it should not. I don't see why it can't just
> > > > leave them be and let me use them.
> > > >
> > >
> > > I guess it comes down to what assumptions you can make in driver
> > > backend. But if you can, for example, move planes between crtcs, I
> > > think you can't make assumptions about the order in which things
> > > complete even if you don't have commits overtaking each other on a
> > > single crtc..
> >
> > IMO this whole notion of accessing new_crtc_state & co. being unsafe
> > for some reason is wrong. I think as long as I have the drm_atomic_state
> > I should be able to look at the new/old states within.
> >
>
> accessing new state only works if you can guarantee the order in which
> commits complete, which I don't think you can do in the general sense.

Doesn't feel like it should take a lot of rocket science to guarantee
the states get freed in the right order.

--
Ville Syrj?l?
Intel

2019-11-04 21:13:56

by Rob Clark

[permalink] [raw]
Subject: Re: [PATCH 2/2] drm/atomic: clear new_state pointers at hw_done

On Mon, Nov 4, 2019 at 12:50 PM Ville Syrjälä
<[email protected]> wrote:
>
> On Mon, Nov 04, 2019 at 11:13:59AM -0800, Rob Clark wrote:
> > On Mon, Nov 4, 2019 at 10:42 AM Ville Syrjälä
> > <[email protected]> wrote:
> > >
> > > On Fri, Nov 01, 2019 at 03:14:09PM -0700, Rob Clark wrote:
> > > > On Fri, Nov 1, 2019 at 2:44 PM Ville Syrjälä
> > > > <[email protected]> wrote:
> > > > >
> > > > > On Fri, Nov 01, 2019 at 12:49:02PM -0700, Rob Clark wrote:
> > > > > > On Fri, Nov 1, 2019 at 12:25 PM Ville Syrjälä
> > > > > > <[email protected]> wrote:
> > > > > > >
> > > > > > > On Fri, Nov 01, 2019 at 11:07:13AM -0700, Rob Clark wrote:
> > > > > > > > From: Rob Clark <[email protected]>
> > > > > > > >
> > > > > > > > The new state should not be accessed after this point. Clear the
> > > > > > > > pointers to make that explicit.
> > > > > > > >
> > > > > > > > This makes the error corrected in the previous patch more obvious.
> > > > > > > >
> > > > > > > > Signed-off-by: Rob Clark <[email protected]>
> > > > > > > > ---
> > > > > > > > drivers/gpu/drm/drm_atomic_helper.c | 29 +++++++++++++++++++++++++++++
> > > > > > > > 1 file changed, 29 insertions(+)
> > > > > > > >
> > > > > > > > diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
> > > > > > > > index 732bd0ce9241..176831df8163 100644
> > > > > > > > --- a/drivers/gpu/drm/drm_atomic_helper.c
> > > > > > > > +++ b/drivers/gpu/drm/drm_atomic_helper.c
> > > > > > > > @@ -2234,13 +2234,42 @@ EXPORT_SYMBOL(drm_atomic_helper_fake_vblank);
> > > > > > > > */
> > > > > > > > void drm_atomic_helper_commit_hw_done(struct drm_atomic_state *old_state)
> > > > > > > > {
> > > > > > > > + struct drm_connector *connector;
> > > > > > > > + struct drm_connector_state *old_conn_state, *new_conn_state;
> > > > > > > > struct drm_crtc *crtc;
> > > > > > > > struct drm_crtc_state *old_crtc_state, *new_crtc_state;
> > > > > > > > + struct drm_plane *plane;
> > > > > > > > + struct drm_plane_state *old_plane_state, *new_plane_state;
> > > > > > > > struct drm_crtc_commit *commit;
> > > > > > > > + struct drm_private_obj *obj;
> > > > > > > > + struct drm_private_state *old_obj_state, *new_obj_state;
> > > > > > > > int i;
> > > > > > > >
> > > > > > > > + /*
> > > > > > > > + * After this point, drivers should not access the permanent modeset
> > > > > > > > + * state, so we also clear the new_state pointers to make this
> > > > > > > > + * restriction explicit.
> > > > > > > > + *
> > > > > > > > + * For the CRTC state, we do this in the same loop where we signal
> > > > > > > > + * hw_done, since we still need to new_crtc_state to fish out the
> > > > > > > > + * commit.
> > > > > > > > + */
> > > > > > > > +
> > > > > > > > + for_each_oldnew_connector_in_state(old_state, connector, old_conn_state, new_conn_state, i) {
> > > > > > > > + old_state->connectors[i].new_state = NULL;
> > > > > > > > + }
> > > > > > > > +
> > > > > > > > + for_each_oldnew_plane_in_state(old_state, plane, old_plane_state, new_plane_state, i) {
> > > > > > > > + old_state->planes[i].new_state = NULL;
> > > > > > > > + }
> > > > > > > > +
> > > > > > > > + for_each_oldnew_private_obj_in_state(old_state, obj, old_obj_state, new_obj_state, i) {
> > > > > > > > + old_state->private_objs[i].new_state = NULL;
> > > > > > > > + }
> > > > > > > > +
> > > > > > > > for_each_oldnew_crtc_in_state(old_state, crtc, old_crtc_state, new_crtc_state, i) {
> > > > > > > > old_state->crtcs[i].new_self_refresh_active = new_crtc_state->self_refresh_active;
> > > > > > > > + old_state->crtcs[i].new_state = NULL;
> > > > > > >
> > > > > > > That's going to be a real PITA when doing programming after the fact from
> > > > > > > a vblank worker. It's already a pain that the new_crtc_state->state is
> > > > > > > getting NULLed somewhere.
> > > > > > >
> > > > > >
> > > > > > I think you already have that problem, this just makes it explicit.
> > > > >
> > > > > I don't yet. Except on a branch where I have my vblank workers.
> > > > > And I think the only problem is having the helpers/core clobber
> > > > > the pointers when it should not. I don't see why it can't just
> > > > > leave them be and let me use them.
> > > > >
> > > >
> > > > I guess it comes down to what assumptions you can make in driver
> > > > backend. But if you can, for example, move planes between crtcs, I
> > > > think you can't make assumptions about the order in which things
> > > > complete even if you don't have commits overtaking each other on a
> > > > single crtc..
> > >
> > > IMO this whole notion of accessing new_crtc_state & co. being unsafe
> > > for some reason is wrong. I think as long as I have the drm_atomic_state
> > > I should be able to look at the new/old states within.
> > >
> >
> > accessing new state only works if you can guarantee the order in which
> > commits complete, which I don't think you can do in the general sense.
>
> Doesn't feel like it should take a lot of rocket science to guarantee
> the states get freed in the right order.
>

I agree, reference counting is not rocket science. But that isn't the
way drm core handles per-object state currently. Refcnt'ing is
probably the approach that I'd recommend if someone wanted to lift
this restriction about accessing new-state later.

BR,
-R

2019-11-06 03:02:34

by Chen, Rong A

[permalink] [raw]
Subject: [drm/atomic] 554231a5c5: BUG:kernel_NULL_pointer_dereference,address

FYI, we noticed the following commit (built with gcc-7):

commit: 554231a5c5757d32fe2a9ad7fa62f9201e83f12d ("[PATCH 2/2] drm/atomic: clear new_state pointers at hw_done")
url: https://github.com/0day-ci/linux/commits/Rob-Clark/drm-atomic-fix-self-refresh-helpers-crtc-state-dereference/20191103-045328


in testcase: xfstests
with following parameters:

disk: 4HDD
fs: btrfs
test: generic-group06

test-description: xfstests is a regression test suite for xfs and other files ystems.
test-url: git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git


on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 8G

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):


+--------------------------------------------------------+------------+------------+
| | f450f81b77 | 554231a5c5 |
+--------------------------------------------------------+------------+------------+
| boot_successes | 15 | 2 |
| boot_failures | 0 | 25 |
| BUG:kernel_NULL_pointer_dereference,address | 0 | 24 |
| Oops:#[##] | 0 | 24 |
| RIP:drm_atomic_helper_wait_for_vblanks[drm_kms_helper] | 0 | 24 |
| Kernel_panic-not_syncing:Fatal_exception | 0 | 24 |
| BUG:kernel_hang_in_boot_stage | 0 | 1 |
+--------------------------------------------------------+------------+------------+


If you fix the issue, kindly add following tag
Reported-by: kernel test robot <[email protected]>


[ 13.638685] BUG: kernel NULL pointer dereference, address: 0000000000000009
[ 13.638688] #PF: supervisor read access in kernel mode
[ 13.638689] #PF: error_code(0x0000) - not-present page
[ 13.638690] PGD 0 P4D 0
[ 13.638693] Oops: 0000 [#1] SMP PTI
[ 13.638696] CPU: 1 PID: 240 Comm: systemd-udevd Not tainted 5.4.0-rc5-00351-g554231a5c5757 #1
[ 13.638698] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 13.638716] RIP: 0010:drm_atomic_helper_wait_for_vblanks+0x62/0x270 [drm_kms_helper]
[ 13.638718] Code: 00 00 49 89 f6 31 ed 45 31 ff 41 bd 01 00 00 00 49 8b 56 20 49 63 df 48 c1 e3 06 48 01 da 4c 8b 22 4d 85 e4 74 39 48 8b 52 18 <80> 7a 09 00 74 2f 4c 89 e7 e8 80 c8 dd ff 85 c0 75 1f 41 8b 8c 24
[ 13.638719] RSP: 0018:ffffb996803076e0 EFLAGS: 00010286
[ 13.638721] RAX: ffff903a17eac800 RBX: 0000000000000000 RCX: ffff903a17eac800
[ 13.638722] RDX: 0000000000000000 RSI: ffff9039daf28400 RDI: ffff903a17eac800
[ 13.638723] RBP: 0000000000000000 R08: ffff903a180b7928 R09: 0000000000000000
[ 13.638724] R10: 0000000000000020 R11: 0000000000000020 R12: ffff9039db049050
[ 13.638724] R13: 0000000000000001 R14: ffff9039daf28400 R15: 0000000000000000
[ 13.638726] FS: 00007efd423348c0(0000) GS:ffff903a3fd00000(0000) knlGS:0000000000000000
[ 13.638727] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 13.638728] CR2: 0000562193d79ff0 CR3: 0000000218414000 CR4: 00000000000406e0
[ 13.638735] Call Trace:
[ 13.638747] ? complete_all+0x31/0x40
[ 13.638756] drm_atomic_helper_commit_tail+0x52/0x60 [drm_kms_helper]
[ 13.638765] commit_tail+0xa5/0xb0 [drm_kms_helper]
[ 13.638772] drm_atomic_helper_commit+0xfc/0x110 [drm_kms_helper]
[ 13.638799] drm_client_modeset_commit_atomic+0x1c7/0x1e0 [drm]
[ 13.638816] drm_client_modeset_commit_force+0x55/0x170 [drm]
[ 13.638825] drm_fb_helper_restore_fbdev_mode_unlocked+0x45/0x90 [drm_kms_helper]
[ 13.638832] drm_fb_helper_set_par+0x29/0x50 [drm_kms_helper]
[ 13.638837] fbcon_init+0x37a/0x610
[ 13.638870] visual_init+0xce/0x130
[ 13.638873] do_bind_con_driver+0x205/0x410
[ 13.638876] do_take_over_console+0x76/0x1a0
[ 13.638879] do_fbcon_takeover+0x58/0xb0
[ 13.638882] register_framebuffer+0x224/0x330
[ 13.638891] __drm_fb_helper_initial_config_and_unlock+0x2f9/0x520 [drm_kms_helper]
[ 13.638899] drm_fbdev_client_hotplug+0xd8/0x1a0 [drm_kms_helper]
[ 13.638906] drm_fbdev_generic_setup+0x9b/0x110 [drm_kms_helper]
[ 13.638912] bochs_pci_probe+0x12d/0x150 [bochs_drm]
[ 13.638916] local_pci_probe+0x42/0x90
[ 13.638919] ? _cond_resched+0x19/0x30
[ 13.638922] pci_device_probe+0x10b/0x1c0
[ 13.638925] really_probe+0xef/0x420
[ 13.638928] driver_probe_device+0x110/0x120
[ 13.638930] device_driver_attach+0x4f/0x60
[ 13.638932] __driver_attach+0x9a/0x140
[ 13.638934] ? device_driver_attach+0x60/0x60
[ 13.638937] bus_for_each_dev+0x76/0xc0
[ 13.638940] ? klist_add_tail+0x3b/0x70
[ 13.638942] bus_add_driver+0x144/0x220
[ 13.638945] ? 0xffffffffc0521000
[ 13.638946] driver_register+0x5b/0xf0
[ 13.638948] ? 0xffffffffc0521000
[ 13.638951] do_one_initcall+0x46/0x214
[ 13.638956] ? free_unref_page_commit+0x9f/0x120
[ 13.638958] ? _cond_resched+0x19/0x30
[ 13.638961] ? kmem_cache_alloc_trace+0x3b/0x230
[ 13.638964] do_init_module+0x5b/0x223
[ 13.638969] load_module+0x1b8f/0x2010
[ 13.638973] ? ima_post_read_file+0xe2/0x120
[ 13.638977] ? __do_sys_finit_module+0xe9/0x110
[ 13.638979] __do_sys_finit_module+0xe9/0x110
[ 13.638983] do_syscall_64+0x5b/0x1d0
[ 13.638988] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 13.638990] RIP: 0033:0x7efd411b0469
[ 13.638992] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ff 49 2b 00 f7 d8 64 89 01 48
[ 13.638993] RSP: 002b:00007ffc5b64b208 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 13.638995] RAX: ffffffffffffffda RBX: 000055c5ed263260 RCX: 00007efd411b0469
[ 13.638996] RDX: 0000000000000000 RSI: 00007efd41ac9265 RDI: 0000000000000016
[ 13.638997] RBP: 00007efd41ac9265 R08: 0000000000000000 R09: 00007ffc5b64b780
[ 13.638998] R10: 0000000000000016 R11: 0000000000000246 R12: 0000000000000000
[ 13.638999] R13: 000055c5ed25e920 R14: 0000000000020000 R15: 000055c5ebbc1cbc
[ 13.639001] Modules linked in: ppdev bochs_drm(+) drm_vram_helper ttm drm_kms_helper snd_pcm snd_timer aesni_intel(+) ata_piix snd syscopyarea sysfillrect crypto_simd sysimgblt cryptd libata glue_helper fb_sys_fops drm soundcore joydev pcspkr serio_raw i2c_piix4 parport_pc parport floppy ip_tables
[ 13.639019] CR2: 0000000000000009
[ 13.639023] ---[ end trace 5adbcce539449ada ]---


To reproduce:

# build kernel
cd linux
cp config-5.4.0-rc5-00351-g554231a5c5757 .config
make HOSTCC=gcc-7 CC=gcc-7 ARCH=x86_64 olddefconfig prepare modules_prepare bzImage modules
make HOSTCC=gcc-7 CC=gcc-7 ARCH=x86_64 INSTALL_MOD_PATH=<mod-install-dir> modules_install
cd <mod-install-dir>
find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz


git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email



Thanks,
Rong Chen


Attachments:
(No filename) (7.10 kB)
config-5.4.0-rc5-00351-g554231a5c5757 (203.86 kB)
job-script (5.53 kB)
dmesg.xz (14.54 kB)
Download all attachments