Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752615AbdI0JRf (ORCPT ); Wed, 27 Sep 2017 05:17:35 -0400 Received: from szxga05-in.huawei.com ([45.249.212.191]:7041 "EHLO szxga05-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752114AbdI0JRc (ORCPT ); Wed, 27 Sep 2017 05:17:32 -0400 Subject: Re: [Question] null pointer risk of kernel workqueue To: Tejun Heo References: <59C62398.6040101@huawei.com> <20170925152536.GL828415@devbig577.frc2.facebook.com> CC: , , Linuxarm , John Garry , From: tanxiaofei Message-ID: <59CB6C9C.7000205@huawei.com> Date: Wed, 27 Sep 2017 17:17:16 +0800 User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.5.1 MIME-Version: 1.0 In-Reply-To: <20170925152536.GL828415@devbig577.frc2.facebook.com> Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit X-Originating-IP: [10.74.185.74] X-CFilter-Loop: Reflected X-Mirapoint-Virus-RAPID-Raw: score=unknown(0), refid=str=0001.0A020203.59CB6CA6.0031,ss=1,re=0.000,recu=0.000,reip=0.000,cl=1,cld=1,fgs=0, ip=0.0.0.0, so=2014-11-16 11:51:01, dmn=2013-03-21 17:37:32 X-Mirapoint-Loop-Id: e72e2c208de53035b3b0a7eda39fbccd Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 1520 Lines: 38 On 2017/9/25 23:25, Tejun Heo wrote: > Hello, > > On Sat, Sep 23, 2017 at 05:04:24PM +0800, tanxiaofei wrote: >> Hi Tejun & Jiangshan, >> >> I find an null pointer risk in the code of workqueue. Here is description: >> >> If draining, __queue_work() will call the function is_chained_work() to do some checks. >> In is_chained_work(), worker->current_pwq is used directly. It should be not safe. >> http://elixir.free-electrons.com/linux/latest/source/kernel/workqueue.c#L1384 >> >> If you check the thread function of this worker, worker_thread(), you will find worker->current_pwq >> is null when one work is done or ready to be processed. >> This issue may happen only if we queue work during executing drain_workqueue(). >> http://elixir.free-electrons.com/linux/latest/source/kernel/workqueue.c#L2173 > > Hmmm? I don't get it. worker->current_pwq is guaranteed to be set > while a work function is being executed and the chained check can only > get there iff the the worker is executing a work function. > Hi Tejun, Thanks for your quick reply. Hmm, but i think interrupt preemption could make this happen. For example, the scenario that all following conditions are met. 1.The handler of an interrupt call queue_work() to queue some works, and the workqueue is draining. 2.An worker thread is interrupted by this interrupt. 3.The worker thread is being executed, and at the very moment,worker->current_pwq is set to null.(This is possible and please check function worker_thread() carefully). Thanks.