2006-08-19 23:51:26

by Solar Designer

[permalink] [raw]
Subject: [PATCH] loop.c: kernel_thread() retval check

Willy,

I propose the attached patch (extracted from 2.4.33-ow1) for inclusion
into 2.4.34-pre. (Last time I checked, 2.6 needed an equivalent fix,
but I haven't produced one yet.)

Basically, the code in drivers/block/loop.c did not check the return
value from kernel_thread(). If kernel_thread() would fail, the code
would misbehave (IIRC, the invoking process would become unkillable).

An easy way to trigger the bug was to run losetup under strace (as
root), and this is also how I tested the error path added with this
patch.

This change has been a part of publicly released -ow patches for 8+
months.

There are more instances of kernel_thread() calls that do not check the
return value; some of the remaining ones might need to be fixed, too.

Thanks,

Alexander


Attachments:
(No filename) (774.00 B)
linux-2.4.33-ow1-loop-kernel_thread-check.diff (901.00 B)
Download all attachments

2006-08-20 07:32:20

by Willy Tarreau

[permalink] [raw]
Subject: Re: [PATCH] loop.c: kernel_thread() retval check

Hi Alexander,

On Sun, Aug 20, 2006 at 03:46:29AM +0400, Solar Designer wrote:
> Willy,
>
> I propose the attached patch (extracted from 2.4.33-ow1) for inclusion
> into 2.4.34-pre. (Last time I checked, 2.6 needed an equivalent fix,
> but I haven't produced one yet.)
>
> Basically, the code in drivers/block/loop.c did not check the return
> value from kernel_thread(). If kernel_thread() would fail, the code
> would misbehave (IIRC, the invoking process would become unkillable).
>
> An easy way to trigger the bug was to run losetup under strace (as
> root), and this is also how I tested the error path added with this
> patch.

That's amazing it never got fixed before !
I still remembered this problem being discussed, and finally found
the thread :

http://lkml.org/lkml/2003/11/14/55

In fact, no code was proposed and 2.6 got fixed later, then stopped
using kernel_thread() so nearly nobody might have noticed it :

http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff_plain;h=3e88c17d404c5787afd5bd1763380317f5ccbf84;hp=22e6c1b39c648850438decd491f62d311800c7db

> This change has been a part of publicly released -ow patches for 8+
> months.
>
> There are more instances of kernel_thread() calls that do not check the
> return value; some of the remaining ones might need to be fixed, too.

I've read somewhere that the module loading code might be affected too.

> Thanks,
>
> Alexander

Thanks for this patch, I know this problem caught me at least one !

Cheers,
Willy


> diff -urpPX nopatch linux-2.4.33/drivers/block/loop.c linux/drivers/block/loop.c
> --- linux-2.4.33/drivers/block/loop.c Fri Jun 3 04:26:42 2005
> +++ linux/drivers/block/loop.c Sat Aug 12 08:51:47 2006
> @@ -693,12 +693,23 @@ static int loop_set_fd(struct loop_devic
> set_blocksize(dev, bs);
>
> lo->lo_bh = lo->lo_bhtail = NULL;
> - kernel_thread(loop_thread, lo, CLONE_FS | CLONE_FILES | CLONE_SIGHAND);
> - down(&lo->lo_sem);
> + error = kernel_thread(loop_thread, lo,
> + CLONE_FS | CLONE_FILES | CLONE_SIGHAND);
> + if (error < 0)
> + goto out_clr;
> + down(&lo->lo_sem); /* wait for the thread to start */
>
> fput(file);
> return 0;
>
> + out_clr:
> + lo->lo_backing_file = NULL;
> + lo->lo_device = 0;
> + lo->lo_flags = 0;
> + loop_sizes[lo->lo_number] = 0;
> + inode->i_mapping->gfp_mask = lo->old_gfp_mask;
> + lo->lo_state = Lo_unbound;
> + fput(file); /* yes, have to do it twice */
> out_putf:
> fput(file);
> out:

2006-08-20 14:36:35

by Solar Designer

[permalink] [raw]
Subject: Re: [PATCH] loop.c: kernel_thread() retval check

On Sun, Aug 20, 2006 at 09:21:48AM +0200, Willy Tarreau wrote:
> I still remembered this problem being discussed, and finally found
> the thread :
>
> http://lkml.org/lkml/2003/11/14/55

I was not aware that this had been discussed before. Bernhard (in the
old LKML posting above) seems to imply that having kernel_thread()
itself not fail on ptrace would be a sufficient fix, which I don't agree
with. There may be other reasons for kernel_thread() to fail, such as
the kernel running out of resources; with OpenVZ, kernel_thread() is not
allowed from within VEs.

> In fact, no code was proposed and 2.6 got fixed later, then stopped
> using kernel_thread() so nearly nobody might have noticed it :
>
> http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff_plain;h=3e88c17d404c5787afd5bd1763380317f5ccbf84;hp=22e6c1b39c648850438decd491f62d311800c7db

I'm afraid that this does not properly clean things up on error. I just
had a look at linux-2.6.17.9/drivers/block/loop.c - it still uses
kernel_thread() and has the same "goto out_putf" on error return from
kernel_thread(), which appears to not clean things up.

Alexander

2006-08-20 17:55:19

by Alan

[permalink] [raw]
Subject: Re: [PATCH] loop.c: kernel_thread() retval check

Ar Sul, 2006-08-20 am 03:46 +0400, ysgrifennodd Solar Designer:
> Willy,
>
> I propose the attached patch (extracted from 2.4.33-ow1) for inclusion
> into 2.4.34-pre. (Last time I checked, 2.6 needed an equivalent fix,
> but I haven't produced one yet.)

Acked-by: Alan Cox <[email protected]>

but should go into 2.6-mm and be tested in 2.6-mm before 2.4 backport.

2006-08-20 22:40:35

by Solar Designer

[permalink] [raw]
Subject: Re: [PATCH] loop.c: kernel_thread() retval check

Alan,

On Sun, Aug 20, 2006 at 07:15:17PM +0100, Alan Cox wrote:
> Acked-by: Alan Cox <[email protected]>

Thanks.

> but should go into 2.6-mm and be tested in 2.6-mm before 2.4 backport.

Unfortunately, I do not intend to prepare/test/submit a revision of the
patch for 2.6-mm, and I might only work on a 2.6 patch in a few months
from now. Unless someone else takes care of this, the fix is likely to
get dropped on the floor.

Do you really think that getting this fix into 2.4 should be delayed
like that? Please note that the patch affects error path only and that
without the patch things misbehave badly whenever the error occurs.
I really don't think the patch can make things worse, even if it were
buggy (although it's been tested with the error path triggered
specifically on 2.4).

Alexander

2006-08-20 22:50:56

by Alan

[permalink] [raw]
Subject: Re: [PATCH] loop.c: kernel_thread() retval check

Ar Llu, 2006-08-21 am 02:34 +0400, ysgrifennodd Solar Designer:
> Alan,
>
> On Sun, Aug 20, 2006 at 07:15:17PM +0100, Alan Cox wrote:
> > Acked-by: Alan Cox <[email protected]>
>
> Thanks.
>
> > but should go into 2.6-mm and be tested in 2.6-mm before 2.4 backport.
>
> Unfortunately, I do not intend to prepare/test/submit a revision of the
> patch for 2.6-mm, and I might only work on a 2.6 patch in a few months
> from now. Unless someone else takes care of this, the fix is likely to
> get dropped on the floor.
>
> Do you really think that getting this fix into 2.4 should be delayed
> like that?

I have no problem with it going into 2.4. Someone should forward port it
- but there should be plenty of people on the lists to do that.

2006-08-20 22:58:59

by Willy Tarreau

[permalink] [raw]
Subject: Re: [PATCH] loop.c: kernel_thread() retval check

On Mon, Aug 21, 2006 at 12:11:08AM +0100, Alan Cox wrote:
> Ar Llu, 2006-08-21 am 02:34 +0400, ysgrifennodd Solar Designer:
> > Do you really think that getting this fix into 2.4 should be delayed
> > like that?
>
> I have no problem with it going into 2.4. Someone should forward port it
> - but there should be plenty of people on the lists to do that.

That's so true ! We're relying too much on our own time while many people
might be waiting for some little work to start taking a look at the kernel
internals !

I think I will be starting to ask for forward porters for the fixed to 2.4
that need to be ported to 2.6 too.

Cheers,
Willy

2006-08-21 01:59:44

by Julio Auto

[permalink] [raw]
Subject: Re: [PATCH] loop.c: kernel_thread() retval check

On 8/20/06, Willy Tarreau <[email protected]> wrote:
> That's so true ! We're relying too much on our own time while many people
> might be waiting for some little work to start taking a look at the kernel
> internals !
>
> I think I will be starting to ask for forward porters for the fixed to 2.4
> that need to be ported to 2.6 too.

Well, actually I'd be glad to help. In fact, with this particular 2.4
patch at hand, fixing 2.6 seems incredbly straight-forward (or am I
getting ahead of myself?)
However, I wasn't able to reproduce the bug in my system just by
running losetup under strace. Maybe 2.6.15-1.2054_FC5 has it patched?
If if no one can help me out with this I'll boot into a stock kernel
later and try to trigger the bug again.

Cheers,

Julio Auto

2006-08-21 02:36:25

by Solar Designer

[permalink] [raw]
Subject: Re: [PATCH] loop.c: kernel_thread() retval check

On Sun, Aug 20, 2006 at 10:59:41PM -0300, Julio Auto wrote:
> On 8/20/06, Willy Tarreau <[email protected]> wrote:
> >I think I will be starting to ask for forward porters for the fixed to 2.4
> >that need to be ported to 2.6 too.
>
> Well, actually I'd be glad to help. In fact, with this particular 2.4
> patch at hand, fixing 2.6 seems incredbly straight-forward (or am I
> getting ahead of myself?)

You need to make sure that the cleanup code added with the patch matches
the loop device initialization preceding the kernel_thread() call. You
should not blindly take the cleanup code out of the 2.4 patch and apply
it to 2.6 - it might not be correct for 2.6.

> However, I wasn't able to reproduce the bug in my system just by
> running losetup under strace. Maybe 2.6.15-1.2054_FC5 has it patched?

No. But you won't be able to reproduce this with strace on 2.6 since
2.6's kernel_thread() uses CLONE_UNTRACED instead of failing on ptrace.
You'll probably need to temporarily replace the kernel_thread() call in
loop.c with -EAGAIN to comfortably test your cleanup code without
forcing the system to run out of resources.

Alexander

2006-08-21 02:47:32

by Julio Auto

[permalink] [raw]
Subject: Re: [PATCH] loop.c: kernel_thread() retval check

On 8/20/06, Solar Designer <[email protected]> wrote:
> You need to make sure that the cleanup code added with the patch matches
> the loop device initialization preceding the kernel_thread() call. You
> should not blindly take the cleanup code out of the 2.4 patch and apply
> it to 2.6 - it might not be correct for 2.6.

Yes, I already had that in mind, but thanks for the worry, anyway.

> No. But you won't be able to reproduce this with strace on 2.6 since
> 2.6's kernel_thread() uses CLONE_UNTRACED instead of failing on ptrace.
> You'll probably need to temporarily replace the kernel_thread() call in
> loop.c with -EAGAIN to comfortably test your cleanup code without
> forcing the system to run out of resources.

Thanks for the tip. I'll see what I can do. :)

Cheers,

Julio Auto