Date: Fri, 14 Aug 2015 17:53:32 +0200
From: Phil Sutter <phil@nwl.cc>
To: linux-kernel@vger.kernel.org
Subject: Re: kthreads: sporadic NULL pointer dereference in exit_creds()
Message-ID: <20150814155332.GA13635@orbit.nwl.cc>
References: <20150812150931.GE32353@orbit.nwl.cc>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150812150931.GE32353@orbit.nwl.cc>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 2568
Lines: 78

Hi,

I found the problem, it was a bug in my own code. For details see below:

On Wed, Aug 12, 2015 at 05:09:31PM +0200, Phil Sutter wrote:
[...]
> Here is the reproducer code (kthread_test.c) I used:
> 
> -----------------------------------8<-----------------------------------
> #include <linux/kthread.h>
> #include <linux/module.h>
> #include <linux/semaphore.h>
> #include <linux/slab.h>
> 
> static int tcount = 100;
> module_param(tcount, int, 0);
> MODULE_PARM_DESC(tcount, "Number of threads to spawn (default: 100)");
> 
> static struct semaphore prestart_sem;
> static struct semaphore startup_sem = __SEMAPHORE_INITIALIZER(startup_sem, 0);
> 
> static int threadfunc(void *unused)
> {
> 	up(&prestart_sem);
> 	if (down_interruptible(&startup_sem))
> 		pr_warn("thread: down_interruptible failed!\n");
> 	printk(KERN_INFO "thread: running\n");
> 	return 0;
> }

This function exits without waiting for kthread_should_stop() to return
true, which allows for do_fork() to do the cleanup (inside
wait_for_vfork_done()).

> static int __init init_kthread_test(void)
> {
> 	struct task_struct **tsk;
> 	int i, err;
> 
> 	tsk = kzalloc(tcount * sizeof(struct task_struct *), GFP_KERNEL);
> 	sema_init(&prestart_sem, 1 - tcount);
> 
> 	printk(KERN_INFO "%s: starting test run\n", __func__);
> 
> 	for (i = 0; i < tcount; i++) {
> 		tsk[i] = kthread_run(threadfunc, NULL, "thread[%d]", i);
> 		if (IS_ERR(tsk[i]))
> 			pr_warn("%s: kthread_run failed for thread %d\n", __func__, i);
> 		else
> 			printk(KERN_INFO "%s: started thread at %p\n", __func__, tsk[i]);
> 	}
> 
> 	if (down_interruptible(&prestart_sem))
> 		pr_warn("%s: down_interruptible failed!\n", __func__);
> 	for (i = 0; i < tcount; i++)
> 		up(&startup_sem);
> 
> 	for (i = 0; i < tcount; i++) {
> 		if (IS_ERR(tsk[i]))
> 			continue;
> 		if ((err = kthread_stop(tsk[i])))
> 			pr_warn("%s: kthread_stop failed for thread %d: %d\n", __func__, i, err);

Calling kthread_stop() for a thread that has already returned by itself
then leads to the problem of calling exit_creds() twice.

I wonder a bit if this should be covered for or not, as the call to
__put_task_struct is protected by a usage counter in struct task_struct.

I fixed the issue by looping over schedule() at the end of threadfunc()
until kthread_should_stop() returns true.

Cheers, Phil
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/