2024-06-07 16:46:38

by Tim Van Patten

[permalink] [raw]
Subject: [PATCH] workqueue: Prevent delayed work UAF kernel panic

From: Tim Van Patten <[email protected]>

A kernel panic can be triggered by delayed work being queued to a
workqueue that has been destroyed (wq->cpu_pwq nullified).

Commit 33e3f0a3358b ("workqueue: Add a new flag to spot the potential
UAF error")added the flag __WQ_DESTROYING to help avoid this. However,
this solution still allows work to be queued if it's from the same
workqueue, even if the workqueue has been fully destroyed, which is
possible with queue_delayed_work().

1. queue_delayed_work()
2. destroy_workqueue()
3. [delayed work timer expires]: delayed_work_timer_fn()

To prevent kernel panics, check if the pwq and pwq->pool pointers are
valid before derefencing them, and discard the work if they're not.

Discarding all work once __WQ_DESTROYING has been set (including from
the same workqueue) causes breakage, so we must check the pointers
directly.

Signed-off-by: Tim Van Patten <[email protected]>
---

kernel/workqueue.c | 13 +++++++++++++
1 file changed, 13 insertions(+)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 003474c9a77d0..6bcd5605dbc8b 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2291,6 +2291,18 @@ static void __queue_work(int cpu, struct workqueue_struct *wq,
}

pwq = rcu_dereference(*per_cpu_ptr(wq->cpu_pwq, cpu));
+
+ /*
+ * Discard work queued to a destroyed wq.
+ * This must be checked while the rcu_read_lock() is
+ * held, so destroy_workqueue() cannot nullify wq->cpu_pwq while it's
+ * being accessed here.
+ */
+ if (WARN_ON_ONCE(!pwq || !pwq->pool)) {
+ pr_warn("workqueue %s: discarding work for destroyed wq\n", wq->name);
+ goto out_rcu_read_unlock;
+ }
+
pool = pwq->pool;

/*
@@ -2365,6 +2377,7 @@ static void __queue_work(int cpu, struct workqueue_struct *wq,

out:
raw_spin_unlock(&pool->lock);
+out_rcu_read_unlock:
rcu_read_unlock();
}

--
2.45.2.505.gda0bf45e8d-goog



2024-06-07 16:54:03

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH] workqueue: Prevent delayed work UAF kernel panic

Hello,

On Fri, Jun 07, 2024 at 10:46:11AM -0600, Tim Van Patten wrote:
...
> Commit 33e3f0a3358b ("workqueue: Add a new flag to spot the potential
> UAF error")added the flag __WQ_DESTROYING to help avoid this. However,
> this solution still allows work to be queued if it's from the same
> workqueue, even if the workqueue has been fully destroyed, which is
> possible with queue_delayed_work().
>
> 1. queue_delayed_work()
> 2. destroy_workqueue()
> 3. [delayed work timer expires]: delayed_work_timer_fn()
>
> To prevent kernel panics, check if the pwq and pwq->pool pointers are
> valid before derefencing them, and discard the work if they're not.

Nothing guarantees that they'd stay NULL after wq destruction, right?

> Discarding all work once __WQ_DESTROYING has been set (including from
> the same workqueue) causes breakage, so we must check the pointers
> directly.

There's only so much protection we can offer for buggy code and I'd much
prefer an approach where the overhead is in the destruction path rather than
the queueing path.

Thanks.

--
tejun

2024-06-07 17:35:14

by Tim Van Patten

[permalink] [raw]
Subject: Re: [PATCH] workqueue: Prevent delayed work UAF kernel panic

Hi,

On Fri, Jun 7, 2024 at 10:53 AM Tejun Heo <[email protected]> wrote:
> To prevent kernel panics, check if the pwq and pwq->pool pointers are
> > valid before derefencing them, and discard the work if they're not.
>
> Nothing guarantees that they'd stay NULL after wq destruction, right?

It doesn't appear it's possible to re-use a wq once it's been
destroyed, so I don't
think they can be re-initialized once they're NULL (nor the __WQ_DESTROYING
flag cleared). I could certainly be wrong here though.

> > Discarding all work once __WQ_DESTROYING has been set (including from
> > the same workqueue) causes breakage, so we must check the pointers
> > directly.
>
> There's only so much protection we can offer for buggy code and I'd much
> prefer an approach where the overhead is in the destruction path rather than
> the queueing path.

That's a good point about avoiding the overhead I hadn't given enough
consideration.

A fix in the destruction path would presumably require draining the work in
drain_workqueue() or discarding it in destroy_workqueue(). My naive
interpretation of
things would be to discard it, so the work isn't executed
preemptively, but I don't know
what the expectations are for delayed work regarding which is better:
do it early or don't
do it at all. As you've pointed out, since this is buggy code to begin
with, I don't think
there's any contract that needs to be adhered to super-closely, which
is why I'm leaning
towards the discard path.

Regardless, it doesn't appear we have a list of those delayed work
items available today.
Adding one should be possible, but that would include removing the work in
delayed_work_timer_fn() in the normal path, which would also add
overhead (and likely
more than the pointer checks, due to searching the list, etc.).

I think a better approach here would be to update
delayed_work_timer_fn() to check
__WQ_DESTROYING and discard all attempts to queue to a destroyed wq. I haven't
given this as much testing (I just kicked off a round of testing), but
it should help
reduce the overhead impact.

I'm far from an expert here, so any input is appreciated. Any thoughts
on this approach
instead?

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
--- a/kernel/workqueue.c (revision 4bd3ef41540b950cf63179be1967aac6d0937766)
+++ b/kernel/workqueue.c (date 1717780157589)
@@ -1925,6 +1925,17 @@
{
struct delayed_work *dwork = from_timer(dwork, t, timer);

+ /*
+ * Prevent a kernel panic by discarding work queued to a destroyed wq.
+ * This must be checked while the rcu_read_lock() is
+ * held, so destroy_workqueue() cannot nullify wq->cpu_pwq while it's
+ * being accessed here.
+ */
+ if (WARN_ON_ONCE(dwork->wq->flags & __WQ_DESTROYING)) {
+ pr_warn("workqueue %s: discarding work for destroyed wq\n", dwork->wq->name);
+ return;
+ }
+
/* should have been called from irqsafe timer with irq already off */
__queue_work(dwork->cpu, dwork->wq, &dwork->work);
}

2024-06-10 20:29:17

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH] workqueue: Prevent delayed work UAF kernel panic

Hello,

On Fri, Jun 07, 2024 at 11:33:43AM -0600, Tim Van Patten wrote:
> > Nothing guarantees that they'd stay NULL after wq destruction, right?
>
> It doesn't appear it's possible to re-use a wq once it's been
> destroyed, so I don't

Hmm... how not? It can be freed and then reallocated for whatever, right?

> think they can be re-initialized once they're NULL (nor the __WQ_DESTROYING
> flag cleared). I could certainly be wrong here though.
>
> > > Discarding all work once __WQ_DESTROYING has been set (including from
> > > the same workqueue) causes breakage, so we must check the pointers
> > > directly.
> >
> > There's only so much protection we can offer for buggy code and I'd much
> > prefer an approach where the overhead is in the destruction path rather than
> > the queueing path.
>
> That's a good point about avoiding the overhead I hadn't given enough
> consideration.
>
> A fix in the destruction path would presumably require draining the work
> in drain_workqueue() or discarding it in destroy_workqueue(). My naive
> interpretation of things would be to discard it, so the work isn't
> executed preemptively, but I don't know what the expectations are for
> delayed work regarding which is better: do it early or don't do it at all.
> As you've pointed out, since this is buggy code to begin with, I don't
> think there's any contract that needs to be adhered to super-closely,
> which is why I'm leaning towards the discard path.

Probably the safest thing to do would be letting destroy_workqueue() skip
destroying if it detects that the workqueue can't be drained correctly.
Obviously, there's no way to protect against queueing after destruction is
complete tho.

> Regardless, it doesn't appear we have a list of those delayed work items
> available today. Adding one should be possible, but that would include
> removing the work in delayed_work_timer_fn() in the normal path, which
> would also add overhead (and likely more than the pointer checks, due to
> searching the list, etc.).
>
> I think a better approach here would be to update delayed_work_timer_fn()
> to check __WQ_DESTROYING and discard all attempts to queue to a destroyed
> wq. I haven't given this as much testing (I just kicked off a round of
> testing), but it should help reduce the overhead impact.

I don't know. Isn't that really partial? The workqueue struct may have been
freed and reallocated for something else long ago. What does checking a flag
on the struct mean?

If you really want to add protection against delayed queueing, one possible
way is adding per-CPU counters which track the number of work items which
are being delayed and check that from destroy path and error out there.
However, ultimately, there's only so much we can do about use-after-free
bugs. It's C after all.

Thanks.

--
tejun