2022-12-08 01:08:22

by John Moon

[permalink] [raw]
Subject: [PATCH] workqueue: Check for null pointer return from get_work_pwq()

We've encountered a kernel panic with the following stack trace:

-> ret_from_fork
-> kthread
-> worker_thread
-> process_one_work
-> pwq_dec_nr_in_flight
-> pwq_activate_inactive_work

The issue was narrowed down to a null pointer dereference within
pwq_activate_inactive_work() stemming from the return value of
get_work_pwq() which may return NULL, but was not checked for
null return prior to use.

While fixing the issue, other dereferences of get_work_pwq()'s
return value were found without a null check.

Add null pointer checks to the calling functions that need them.

Signed-off-by: John Moon <[email protected]>
---
kernel/workqueue.c | 17 +++++++++++++++--
1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 7cd5f5e7e0a1..5de0a2e1aeaa 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1162,6 +1162,9 @@ static void pwq_activate_inactive_work(struct work_struct *work)
{
struct pool_workqueue *pwq = get_work_pwq(work);

+ if (!pwq)
+ return;
+
trace_workqueue_activate_work(work);
if (list_empty(&pwq->pool->worklist))
pwq->pool->watchdog_ts = jiffies;
@@ -2030,8 +2033,12 @@ static void idle_worker_timeout(struct timer_list *t)
static void send_mayday(struct work_struct *work)
{
struct pool_workqueue *pwq = get_work_pwq(work);
- struct workqueue_struct *wq = pwq->wq;
+ struct workqueue_struct *wq;
+
+ if (!pwq)
+ return;

+ wq = pwq->wq;
lockdep_assert_held(&wq_mayday_lock);

if (!wq->rescuer)
@@ -2184,9 +2191,10 @@ __acquires(&pool->lock)
{
struct pool_workqueue *pwq = get_work_pwq(work);
struct worker_pool *pool = worker->pool;
- bool cpu_intensive = pwq->wq->flags & WQ_CPU_INTENSIVE;
+ bool cpu_intensive;
unsigned long work_data;
struct worker *collision;
+
#ifdef CONFIG_LOCKDEP
/*
* It is permissible to free the struct work_struct from
@@ -2199,6 +2207,11 @@ __acquires(&pool->lock)

lockdep_copy_map(&lockdep_map, &work->lockdep_map);
#endif
+ if (!pwq)
+ return;
+
+ cpu_intensive = pwq->wq->flags & WQ_CPU_INTENSIVE;
+
/* ensure we're on the correct CPU */
WARN_ON_ONCE(!(pool->flags & POOL_DISASSOCIATED) &&
raw_smp_processor_id() != pool->cpu);
--
2.17.1


2022-12-12 23:09:28

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH] workqueue: Check for null pointer return from get_work_pwq()

On Wed, Dec 07, 2022 at 04:53:44PM -0800, John Moon wrote:
> We've encountered a kernel panic with the following stack trace:
>
> -> ret_from_fork
> -> kthread
> -> worker_thread
> -> process_one_work
> -> pwq_dec_nr_in_flight
> -> pwq_activate_inactive_work
>
> The issue was narrowed down to a null pointer dereference within
> pwq_activate_inactive_work() stemming from the return value of
> get_work_pwq() which may return NULL, but was not checked for
> null return prior to use.
>
> While fixing the issue, other dereferences of get_work_pwq()'s
> return value were found without a null check.
>
> Add null pointer checks to the calling functions that need them.

At that point the work item must have pwq assigned - see insert_work(), so
this can't be the root cause. It's just papering over a bug somewhere else
(e.g. the work item got freed or written over somehow).

Thanks.

--
tejun