Date: Fri, 6 Feb 2015 13:01:37 -0500
From: Tejun Heo <tj@kernel.org>
To: Rabin Vincent <rabin.vincent@axis.com>
Cc: Jesper Nilsson <jespern@axis.com>, linux-kernel@vger.kernel.org
Subject: Re: Soft lockups in __cancel_work_timer()
Message-ID: <20150206180137.GC10580@htj.dyndns.org>
References: <20150206171156.GA8942@axis.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150206171156.GA8942@axis.com>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 2571
Lines: 72

Hello,

On Fri, Feb 06, 2015 at 06:11:56PM +0100, Rabin Vincent wrote:
> task1                           task2                           worker
> 
>                                                                 add to busy hash
>                                                                 clear pending
>                                                                 exec work
> 
> __cancel_work_timer()
>  try_to_grab_pending()
>    get pending, return 0
>  set work cancelling
>  flush_work()
>    wait_for_completion()
> 
>                                                                 remove from busy_hash
> 
>                                 __cancel_work_timer()
>                                 while (forever) {
>                                   try_to_grab_pending()
>                                     return -ENOENT due to cancelling
>                                   flush_work
>                                     return false due to already gone
>                                 }
> 
> 
> On kernels with CONFIG_PREEMPT disabled, this causes a permanent soft
> lockup of the system.

Ah, drat.

> Adding a cond_resched() to the loop in cancel_delayed_work_sync(), like
> below, helps the simple !PREEMPT case, but does not completely fix the
> problem, because the problem can still be seen if the thread which sees
> the ENOENT has for example SCHED_FIFO scheduling class, both on systems
> with CONFIG_PREEMPT enabled and on !PREEMPT with the cond_resched().
> 
> We've seen this problem with the cancel_delayed_work() call in
> jffs2_sync_fs(), but I've attached a testing patch which forces the
> problem on current mainline without the need for jffs2.
...
> @@ -2741,6 +2741,9 @@ static bool __cancel_work_timer(struct work_struct *work, bool is_dwork)
>  		 */
>  		if (unlikely(ret == -ENOENT))
>  			flush_work(work);
> +
> +		if (ret < 0)
> +			cond_resched();
>  	} while (unlikely(ret < 0));

Well, an obvious thing to do would be

		if (unlikely(ret == -ENOENT)) {
			if (!flush_work(work))
				schedule_timeout(1);
		}

It was gonna block for the work item anyway so I don't think
schedule_timeout() there is a problem.  That said, given that we're
guaranteed to be able to dereference @work there, there probably is a
better way.  I'll think more about it.

Thanks.

-- 
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/