On Wed, May 08, 2024 at 04:39:51PM +0800, Herbert Xu wrote:
> On Fri, Feb 09, 2024 at 01:43:42PM +0100, Damian Muszynski wrote:
> >
> > @@ -146,11 +147,19 @@ static void adf_device_reset_worker(struct work_struct *work)
> >  	adf_dev_restarted_notify(accel_dev);
> >  	clear_bit(ADF_STATUS_RESTARTING, &accel_dev->status);
> >
> > -	/* The dev is back alive. Notify the caller if in sync mode */
> > -	if (reset_data->mode == ADF_DEV_RESET_SYNC)
> > -		complete(&reset_data->compl);
> > -	else
> > +	/*
> > +	 * The dev is back alive. Notify the caller if in sync mode
> > +	 *
> > +	 * If device restart will take a more time than expected,
> > +	 * the schedule_reset() function can timeout and exit. This can be
> > +	 * detected by calling the completion_done() function. In this case
> > +	 * the reset_data structure needs to be freed here.
> > +	 */
> > +	if (reset_data->mode == ADF_DEV_RESET_ASYNC ||
> > +	    completion_done(&reset_data->compl))
> >  		kfree(reset_data);
> > +	else
> > +		complete(&reset_data->compl);
>
> This doesn't work because until you call complete, completion_done
> will always return false. IOW we now have a memory leak instead of
> a UAF.
>
> ---8<---
> Using completion_done to determine whether the caller has gone
> away only works after a complete call. Furthermore it's still
> possible that the caller has not yet called wait_for_completion,
> resulting in another potential UAF.
>
> Fix this by making the caller use cancel_work_sync and then freeing
> the memory safely.
>
> Fixes: 7d42e097607c ("crypto: qat - resolve race condition during AER recovery")
> Cc: <[email protected]> #6.8+
> Signed-off-by: Herbert Xu <[email protected]>
Reviewed-by: Giovanni Cabiddu <[email protected]>

Note that the issue is also present in 6.6+ and 6.7+, not only in 6.8+.
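
For anyone following the completion_done() argument above, here is a
minimal userspace sketch of the timeout path. It is not kernel code: all
model_* names are hypothetical stand-ins, and the only behaviour assumed
from the kernel is that completion_done() reports true only once the
"done" count is non-zero, i.e. after a complete() call.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct completion; only "done" is modelled. */
struct model_completion {
	unsigned int done;
};

enum worker_action { WORKER_FREES, WORKER_COMPLETES };

static void model_init_completion(struct model_completion *x)
{
	x->done = 0;
}

/* Models complete(): bump the counter (a real one also wakes a waiter). */
static void model_complete(struct model_completion *x)
{
	x->done++;
}

/* Models completion_done(): true only when "done" is non-zero. */
static bool model_completion_done(const struct model_completion *x)
{
	return x->done != 0;
}

/*
 * Models a timed wait whose timeout has already expired: it consumes a
 * pending completion if there is one, otherwise it reports a timeout and
 * leaves "done" untouched.
 */
static bool model_wait_timed_out(struct model_completion *x)
{
	if (x->done) {
		x->done--;
		return false;
	}
	return true;
}

/* The sync-mode branch of the quoted hunk, reduced to its decision. */
static enum worker_action
model_worker_decision(const struct model_completion *x)
{
	return model_completion_done(x) ? WORKER_FREES : WORKER_COMPLETES;
}

/* Caller times out first, then the worker decides what to do. */
static enum worker_action model_timeout_scenario(void)
{
	struct model_completion compl;
	bool timed_out;

	model_init_completion(&compl);
	timed_out = model_wait_timed_out(&compl);
	assert(timed_out);
	(void)timed_out;

	/* Nobody has called complete() yet, so "done" is still zero. */
	return model_worker_decision(&compl);
}
```

In this model the worker picks the complete() branch after the caller has
already given up, so neither side frees the allocation. That matches the
leak described above, under the stated assumption about completion_done().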
--
Giovanni