2003-08-08 10:54:05

by Herbert Xu

Subject: [PATCH] 2.4: Fix steal_locks race

Hi:

The steal_locks() call in binfmt_elf.c is buggy. It steals locks from
a files entry whose reference was dropped much earlier. This allows it
to steal other processes' locks.

The following patch calls steal_locks() earlier so that this does not
happen.

Cheers,
--
Debian GNU/Linux 3.0 is out! ( http://www.debian.org/ )
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Attachments:
(No filename) (492.00 B)
p (770.00 B)

2003-08-09 01:16:16

by Herbert Xu

Subject: [PATCH] 2.4: Restore current->files in flush_old_exec

Hi:

The unshare_files patch to flush_old_exec() did not restore the original
state when exec_mmap fails. This patch fixes that.
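
A minimal user-space sketch of the restore-on-failure pattern described above. The toy_* types and functions are hypothetical stand-ins, not the kernel's flush_old_exec() or files_struct, and the exec_mmap failure is simulated with a flag:

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model only: none of these names exist in the kernel. */
struct toy_files { int count; };            /* reference-counted fd table */
struct toy_task  { struct toy_files *files; };

/* Drop one reference and free the table when the last one goes away. */
static void toy_put_files(struct toy_files *f)
{
    assert(f->count > 0);
    if (--f->count == 0)
        free(f);
}

/*
 * Switch the task to a private table, then either commit or undo.
 * Returns 0 on success and -1 on failure. On failure task->files
 * points at the original table again.
 */
static int toy_flush_old_exec(struct toy_task *task, int exec_mmap_fails)
{
    struct toy_files *old = task->files;
    struct toy_files *priv = malloc(sizeof(*priv));

    if (!priv)
        return -1;
    priv->count = 1;
    task->files = priv;             /* switch to the private table */

    if (exec_mmap_fails) {
        task->files = old;          /* undo: put the original table back */
        toy_put_files(priv);        /* and drop the private copy */
        return -1;
    }

    toy_put_files(old);             /* committed: drop our ref on the original */
    return 0;
}

/* Returns 1 if both the failure path and the success path behave as described. */
static int toy_restore_demo(void)
{
    struct toy_files *orig = malloc(sizeof(*orig));
    struct toy_task task;
    int ok;

    if (!orig)
        return 0;
    orig->count = 2;                /* shared with one other task */
    task.files = orig;

    ok = toy_flush_old_exec(&task, 1) == -1
        && task.files == orig && orig->count == 2;
    ok = ok && toy_flush_old_exec(&task, 0) == 0
        && task.files != orig && orig->count == 1;

    if (task.files != orig)
        free(task.files);
    free(orig);
    return ok;
}
```

In this model the failure path leaves both the pointer and the reference count of the original table exactly as they were before the call.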

At this point, I believe the unshare_files stuff should be fine from
a correctness point of view. However, there is still a performance
problem as every ELF exec call ends up duplicating the files structure
as well as walking through all file locks.

Cheers,


Attachments:
(No filename) (614.00 B)
q (1.18 kB)

2003-08-09 01:12:04

by Herbert Xu

Subject: Re: [PATCH] 2.4: Fix steal_locks race

On Fri, Aug 08, 2003 at 08:53:21PM +1000, herbert wrote:
>
> The steal_locks() call in binfmt_elf.c is buggy. It steals locks from
> a files entry whose reference was dropped much earlier. This allows it
> to steal other processes' locks.
>
> The following patch calls steal_locks() earlier so that this does not
> happen.

My patch is buggy too. If a file is closed by another clone between
the two steal_locks calls the lock will again be lost. Fortunately
this is much harder to trigger than the previous bug.

The following patch fixes that.


Attachments:
(No filename) (772.00 B)
p (806.00 B)

2003-08-09 01:48:09

by Andreas Gruenbacher

Subject: Re: [PATCH] 2.4: Restore current->files in flush_old_exec

On Sat, 9 Aug 2003, Herbert Xu wrote:

> Hi:
>
> The unshare_files patch to flush_old_exec() did not restore the original
> state when exec_mmap fails. This patch fixes that.

Indeed. This is still needed.

> At this point, I believe the unshare_files stuff should be fine from
> a correctness point of view. However, there is still a performance
> problem as every ELF exec call ends up duplicating the files structure
> as well as walking through all file locks.

Cheers,
Andreas.

2003-08-09 01:49:28

by Herbert Xu

Subject: Re: [PATCH] 2.4: Restore current->files in flush_old_exec

On Sat, Aug 09, 2003 at 11:11:16AM +1000, herbert wrote:
>
> At this point, I believe the unshare_files stuff should be fine from
> a correctness point of view. However, there is still a performance
> problem as every ELF exec call ends up duplicating the files structure
> as well as walking through all file locks.

Here is the patch that ensures files is only duplicated when necessary.
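
A minimal user-space sketch of duplicating only when necessary, assuming a reference-counted table. The toy_* names are hypothetical and this is not the kernel's unshare_files(); it only illustrates the check "copy if somebody else still holds a reference":

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model only: none of these names exist in the kernel. */
struct toy_files {
    int count;  /* number of tasks sharing this table */
    int nfds;   /* payload standing in for the descriptor array */
};

/*
 * Give the caller a private table.
 * Returns 0 if the table was already private (no copy made),
 * 1 if a copy was made, and -1 if the allocation failed.
 */
static int toy_unshare_files(struct toy_files **filesp)
{
    struct toy_files *old = *filesp;
    struct toy_files *copy;

    assert(old->count >= 1);
    if (old->count == 1)
        return 0;               /* sole user: nothing to duplicate */

    copy = malloc(sizeof(*copy));
    if (!copy)
        return -1;
    *copy = *old;               /* duplicate the contents */
    copy->count = 1;            /* the copy belongs to the caller alone */
    old->count--;               /* caller no longer uses the shared table */
    *filesp = copy;
    return 1;
}

/* Returns 1 if the private case skips the copy and the shared case copies. */
static int toy_unshare_demo(void)
{
    struct toy_files solo = { 1, 3 };
    struct toy_files shared = { 2, 3 };
    struct toy_files *p = &solo;
    struct toy_files *q = &shared;
    int ok;

    ok = toy_unshare_files(&p) == 0 && p == &solo;
    ok = ok && toy_unshare_files(&q) == 1 && q != &shared
        && q->count == 1 && q->nfds == 3 && shared.count == 1;

    if (q != &shared)
        free(q);
    return ok;
}
```

In this model, any caller that keeps an extra reference to the old table (for example so that it can be restored later) makes the count exceed one, which forces the next unshare to copy.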


Attachments:
(No filename) (615.00 B)
p (1.06 kB)

2003-08-09 01:47:00

by Andreas Gruenbacher

Subject: Re: [PATCH] 2.4: Fix steal_locks race

Hello,

Thanks a lot for analyzing this. Please see the patch I just posted under
the subject:

[PATCH] 2.4.22-rc2 steal_locks and load_elf_binary cleanups

That patch fixes the bug in a slightly cleaner way.

On Sat, 9 Aug 2003, Herbert Xu wrote:

> On Fri, Aug 08, 2003 at 08:53:21PM +1000, herbert wrote:
> >
> > The steal_locks() call in binfmt_elf.c is buggy. It steals locks from
> > a files entry whose reference was dropped much earlier. This allows it
> > to steal other processes' locks.

This makes sense.

> > The following patch calls steal_locks() earlier so that this does not
> > happen.
>
> My patch is buggy too. If a file is closed by another clone between
> the two steal_locks calls the lock will again be lost. Fortunately
> this is much harder to trigger than the previous bug.
>
> The following patch fixes that.
>

2003-08-09 02:04:55

by Andreas Gruenbacher

Subject: Re: [PATCH] 2.4: Fix steal_locks race

On Sat, 9 Aug 2003, Herbert Xu wrote:

> On Fri, Aug 08, 2003 at 08:53:21PM +1000, herbert wrote:
> >
> > The steal_locks() call in binfmt_elf.c is buggy. It steals locks from
> > a files entry whose reference was dropped much earlier. This allows it
> > to steal other processes' locks.
> >
> > The following patch calls steal_locks() earlier so that this does not
> > happen.
>
> My patch is buggy too. If a file is closed by another clone between
> the two steal_locks calls the lock will again be lost. Fortunately
> this is much harder to trigger than the previous bug.

I think this is not a strict bug---this scenario is not covered by POSIX
in the first place. Unless lock stealing is done atomically with
unshare_files there is a window of opportunity between unshare_files() and
steal_locks(), so locks can still get lost.


Cheers,
Andreas.

2003-08-09 02:20:40

by Andreas Gruenbacher

Subject: Re: [PATCH] 2.4: Restore current->files in flush_old_exec

On Sat, 9 Aug 2003, Herbert Xu wrote:

> On Sat, Aug 09, 2003 at 11:11:16AM +1000, herbert wrote:
> >
> > At this point, I believe the unshare_files stuff should be fine from
> > a correctness point of view. However, there is still a performance
> > problem as every ELF exec call ends up duplicating the files structure
> > as well as walking through all file locks.
>
> Here is the patch that ensures files is only duplicated when necessary.

This patch is correct but unnecessary: steal_locks already tests for this
condition.


Cheers,
Andreas.

2003-08-09 02:53:48

by Herbert Xu

Subject: Re: [PATCH] 2.4: Restore current->files in flush_old_exec

On Sat, Aug 09, 2003 at 04:20:38AM +0200, Andreas Gruenbacher wrote:
> On Sat, 9 Aug 2003, Herbert Xu wrote:
>
> > On Sat, Aug 09, 2003 at 11:11:16AM +1000, herbert wrote:
> > >
> > > At this point, I believe the unshare_files stuff should be fine from
> > > a correctness point of view. However, there is still a performance
> > > problem as every ELF exec call ends up duplicating the files structure
> > > as well as walking through all file locks.
> >
> > Here is the patch that ensures files is only duplicated when necessary.
>
> This patch is correct but unnecessary: steal_locks already tests for this
> condition.

Yes but when you call unshare_files twice one of them will have to
copy.

2003-08-09 02:52:53

by Herbert Xu

Subject: Re: [PATCH] 2.4: Fix steal_locks race

On Sat, Aug 09, 2003 at 04:04:53AM +0200, Andreas Gruenbacher wrote:
>
> > My patch is buggy too. If a file is closed by another clone between
> > the two steal_locks calls the lock will again be lost. Fortunately
> > this is much harder to trigger than the previous bug.
>
> I think this is not a strict bug---this scenario is not covered by POSIX
> in the first place. Unless lock stealing is done atomically with
> unshare_files there is a window of opportunity between unshare_files() and
> steal_locks(), so locks can still get lost.

It's not a standard compliance issue. In this case the lock will never
be released and it will eventually lead to a crash when someone reads
/proc/locks.

2003-08-09 02:59:12

by Andreas Gruenbacher

Subject: Re: [PATCH] 2.4: Fix steal_locks race

Here is a cumulative patch of Herbert's and my changes.

I could see no reason for ftmp in flush_old_exec and load_elf_binary, so I
removed that. Please correct me if that is wrong.

Cheers,
Andreas.


On Fri, 8 Aug 2003, Herbert Xu wrote:

> Hi:
>
> The steal_locks() call in binfmt_elf.c is buggy. It steals locks from
> a files entry whose reference was dropped much earlier. This allows it
> to steal other processes' locks.
>
> The following patch calls steal_locks() earlier so that this does not
> happen.


Attachments:
unshare-files-fix.diff (4.58 kB)

2003-08-09 03:02:43

by Andreas Gruenbacher

Subject: Re: [PATCH] 2.4: Restore current->files in flush_old_exec

On Sat, 9 Aug 2003, Herbert Xu wrote:

> On Sat, Aug 09, 2003 at 04:20:38AM +0200, Andreas Gruenbacher wrote:
> > On Sat, 9 Aug 2003, Herbert Xu wrote:
> >
> > > On Sat, Aug 09, 2003 at 11:11:16AM +1000, herbert wrote:
> > > >
> > > > At this point, I believe the unshare_files stuff should be fine from
> > > > a correctness point of view. However, there is still a performance
> > > > problem as every ELF exec call ends up duplicating the files structure
> > > > as well as walking through all file locks.
> > >
> > > Here is the patch that ensures files is only duplicated when necessary.
> >
> > This patch is correct but unnecessary: steal_locks already tests for this
> > condition.
>
> Yes but when you call unshare_files twice one of them will have to
> copy.

I see---that happens through flush_old_exec.

2003-08-09 03:04:57

by Herbert Xu

Subject: Re: [PATCH] 2.4: Fix steal_locks race

On Sat, Aug 09, 2003 at 04:59:03AM +0200, Andreas Gruenbacher wrote:
>
> @@ -714,18 +715,16 @@ static int load_elf_binary(struct linux_
>  			elf_entry = load_elf_interp(&interp_elf_ex,
>  						    interpreter,
>  						    &interp_load_addr);
> -
> -		allow_write_access(interpreter);
> -		fput(interpreter);
> -		kfree(elf_interpreter);
> -
>  		if (BAD_ADDR(elf_entry)) {
>  			printk(KERN_ERR "Unable to load interpreter\n");
> -			kfree(elf_phdata);
>  			send_sig(SIGSEGV, current, 0);
>  			retval = -ENOEXEC; /* Nobody gets to see this, but.. */
> -			goto out;
> +			goto out_free_dentry;
>  		}
> +
> +		allow_write_access(interpreter);
> +		fput(interpreter);
> +		kfree(elf_interpreter);
>  	}

This looks bad since you're past the point of no return.

2003-08-09 03:13:54

by Andreas Gruenbacher

Subject: Re: [PATCH] 2.4: Fix steal_locks race

On Sat, 9 Aug 2003, Herbert Xu wrote:

> On Sat, Aug 09, 2003 at 04:04:53AM +0200, Andreas Gruenbacher wrote:
> >
> > > My patch is buggy too. If a file is closed by another clone between
> > > the two steal_locks calls the lock will again be lost. Fortunately
> > > this much harder to trigger than the previous bug.
> >
> > I think this is not a strict bug---this scenario is not covered by POSIX
> > in the first place. Unless lock stealing is done atomically with
> > unshare_files there is a window of opportunity between unshare_files() and
> > steal_locks(), so locks can still get lost.
>
> It's not a standard compliance issue. In this case the lock will never
> be released and it will eventually lead to a crash when someone reads
> /proc/locks.

I don't see how this would happen. Could you please elaborate?

2003-08-09 03:17:18

by Andreas Gruenbacher

Subject: Re: [PATCH] 2.4: Fix steal_locks race

On Sat, 9 Aug 2003, Herbert Xu wrote:

> On Sat, Aug 09, 2003 at 04:59:03AM +0200, Andreas Gruenbacher wrote:
> >
> > @@ -714,18 +715,16 @@ static int load_elf_binary(struct linux_
> >  			elf_entry = load_elf_interp(&interp_elf_ex,
> >  						    interpreter,
> >  						    &interp_load_addr);
> > -
> > -		allow_write_access(interpreter);
> > -		fput(interpreter);
> > -		kfree(elf_interpreter);
> > -
> >  		if (BAD_ADDR(elf_entry)) {
> >  			printk(KERN_ERR "Unable to load interpreter\n");
> > -			kfree(elf_phdata);
> >  			send_sig(SIGSEGV, current, 0);
> >  			retval = -ENOEXEC; /* Nobody gets to see this, but.. */
> > -			goto out;
> > +			goto out_free_dentry;
> >  		}
> > +
> > +		allow_write_access(interpreter);
> > +		fput(interpreter);
> > +		kfree(elf_interpreter);
> >  	}
>
> This looks bad since you're past the point of no return.

This is an equivalence transformation except for the explicit
sys_close(elf_exec_fileno) in the unwind code, which would eventually
happen, anyway.

2003-08-09 03:19:46

by Andreas Gruenbacher

Subject: Re: [PATCH] 2.4: Restore current->files in flush_old_exec

Here is an update ...

> On Sat, 9 Aug 2003, Herbert Xu wrote:
> > Yes but when you call unshare_files twice one of them will have to
> > copy.


Attachments:
unshare-files-fix.diff (4.71 kB)

2003-08-09 03:20:54

by Herbert Xu

Subject: Re: [PATCH] 2.4: Fix steal_locks race

On Sat, Aug 09, 2003 at 05:13:52AM +0200, Andreas Gruenbacher wrote:
> On Sat, 9 Aug 2003, Herbert Xu wrote:
>
> > On Sat, Aug 09, 2003 at 04:04:53AM +0200, Andreas Gruenbacher wrote:
> > >
> > > > My patch is buggy too. If a file is closed by another clone between
> > > > the two steal_locks calls the lock will again be lost. Fortunately
> > > > this is much harder to trigger than the previous bug.
> > >
> > > I think this is not a strict bug---this scenario is not covered by POSIX
> > > in the first place. Unless lock stealing is done atomically with
> > > unshare_files there is a window of opportunity between unshare_files() and
> > > steal_locks(), so locks can still get lost.
> >
> > It's not a standard compliance issue. In this case the lock will never
> > be released and it will eventually lead to a crash when someone reads
> > /proc/locks.
>
> I don't see how this would happen. Could you please elaborate?

Suppose that A and B share current->files and fd has a POSIX lock on it.

A                       B
unshare_files
steal_locks
                        close(fd)
exec fails
steal_locks
put_files_struct

The close in B fails to release the lock as it has been stolen by the
new files structure. The second steal_locks sets the fl_owner back to
the original files structure which no longer has fd in it and hence can
never release that lock. The put_files_struct doesn't release the lock
either since it is now owned by the original file structure.
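
A minimal user-space sketch of the interleaving above, assuming a toy model in which each lock records its owning table (standing in for fl_owner). All toy_* names are hypothetical and none of this is kernel code; it only replays the six steps in order:

```c
#include <assert.h>
#include <string.h>

/* Toy model only: none of these names exist in the kernel. */
#define TOY_MAX_FDS 4

struct toy_files {
    int open[TOY_MAX_FDS];          /* 1 if the descriptor is open here */
};

struct toy_lock {
    const struct toy_files *owner;  /* stands in for fl_owner */
    int fd;                         /* descriptor the lock was taken on */
    int held;
};

/* Re-own the lock if it currently belongs to `from`. */
static void toy_steal_locks(struct toy_lock *lock,
                            const struct toy_files *from,
                            const struct toy_files *to)
{
    if (lock->held && lock->owner == from)
        lock->owner = to;
}

/* Close fd in `files`; the lock is released only if `files` owns it. */
static void toy_close(struct toy_files *files, struct toy_lock *lock, int fd)
{
    assert(fd >= 0 && fd < TOY_MAX_FDS);
    files->open[fd] = 0;
    if (lock->held && lock->fd == fd && lock->owner == files)
        lock->held = 0;
}

/* Drop a table: close every descriptor that is still open in it. */
static void toy_put_files(struct toy_files *files, struct toy_lock *lock)
{
    for (int fd = 0; fd < TOY_MAX_FDS; fd++)
        if (files->open[fd])
            toy_close(files, lock, fd);
}

/* Returns 1 if the lock ends up held by a table that no longer has fd open. */
static int toy_lock_is_orphaned(void)
{
    struct toy_files shared, copy;
    struct toy_lock lock;
    const int fd = 1;

    memset(&shared, 0, sizeof(shared));
    shared.open[fd] = 1;
    lock.owner = &shared;
    lock.fd = fd;
    lock.held = 1;

    copy = shared;                            /* A: unshare_files */
    toy_steal_locks(&lock, &shared, &copy);   /* A: steal_locks */
    toy_close(&shared, &lock, fd);            /* B: close(fd) */
                                              /* A: exec fails */
    toy_steal_locks(&lock, &copy, &shared);   /* A: steal_locks (back) */
    toy_put_files(&copy, &lock);              /* A: put_files_struct */

    return lock.held && lock.owner == &shared && !shared.open[fd];
}
```

In this model toy_lock_is_orphaned() returns 1: the lock is still held, its owner is the original table, and that table no longer has the descriptor open, so no later close can release it.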

2003-08-09 03:28:40

by Andreas Gruenbacher

Subject: Re: [PATCH] 2.4: Fix steal_locks race

On Sat, 2003-08-09 at 05:19, Herbert Xu wrote:
> On Sat, Aug 09, 2003 at 05:13:52AM +0200, Andreas Gruenbacher wrote:
> > On Sat, 9 Aug 2003, Herbert Xu wrote:
> >
> > > On Sat, Aug 09, 2003 at 04:04:53AM +0200, Andreas Gruenbacher wrote:
> > > >
> > > > > My patch is buggy too. If a file is closed by another clone between
> > > > > the two steal_locks calls the lock will again be lost. Fortunately
> > > > > this is much harder to trigger than the previous bug.
> > > >
> > > > I think this is not a strict bug---this scenario is not covered by POSIX
> > > > in the first place. Unless lock stealing is done atomically with
> > > > unshare_files there is a window of opportunity between unshare_files() and
> > > > steal_locks(), so locks can still get lost.
> > >
> > > It's not a standard compliance issue. In this case the lock will never
> > > be released and it will eventually lead to a crash when someone reads
> > > /proc/locks.
> >
> > I don't see how this would happen. Could you please elaborate?
>
> Suppose that A and B share current->files and fd has a POSIX lock on it.
>
> A                       B
> unshare_files
> steal_locks
>                         close(fd)
> exec fails
> steal_locks
> put_files_struct
>
> The close in B fails to release the lock as it has been stolen by the
> new files structure. The second steal_locks sets the fl_owner back to
> the original files structure which no longer has fd in it and hence can
> never release that lock. The put_files_struct doesn't release the lock
> either since it is now owned by the original file structure.

In the patch I've sent there is no stealing back of locks, so that case
does not exist.


Cheers,
--
Andreas Gruenbacher <[email protected]>
SuSE Labs, SuSE Linux AG <http://www.suse.de/>


2003-08-09 10:39:50

by Andreas Gruenbacher

Subject: Re: [PATCH] 2.4: Restore current->files in flush_old_exec

Hi,

On Sat, 2003-08-09 at 12:05, Herbert Xu wrote:
> On Sat, Aug 09, 2003 at 05:54:32AM +0200, Andreas Gruenbacher wrote:
> > On Sat, 9 Aug 2003, Andreas Gruenbacher wrote:
> >
> > > Here is an update ...
> > Do you agree that this is correct?
>
> It looks OK to me. However, I still think the BAD_ADDR change is
> unnecessary.

Very good, thanks. The BAD_ADDR change is indeed not required. It saves
a function call so I think we should keep it, but I don't mind so much.


Cheers,
--
Andreas Gruenbacher <[email protected]>
SuSE Labs, SuSE Linux AG <http://www.suse.de/>