Hi:
The steal_locks() call in binfmt_elf.c is buggy. It steals locks from
a files entry whose reference was dropped much earlier. This allows it
to steal other processes' locks.
The following patch calls steal_locks() earlier so that this does not
happen.
Cheers,
--
Debian GNU/Linux 3.0 is out! ( http://www.debian.org/ )
Email: Herbert Xu 许志壬 <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Hi:
The unshare_files patch to flush_old_exec() did not restore the original
state when exec_mmap fails. This patch fixes that.
At this point, I believe the unshare_files stuff should be fine from
a correctness point of view. However, there is still a performance
problem as every ELF exec call ends up duplicating the files structure
as well as walking through all file locks.
Cheers,
--
Email: Herbert Xu 许志壬 <[email protected]>
On Fri, Aug 08, 2003 at 08:53:21PM +1000, herbert wrote:
>
> The steal_locks() call in binfmt_elf.c is buggy. It steals locks from
> a files entry whose reference was dropped much earlier. This allows it
> to steal other processes' locks.
>
> The following patch calls steal_locks() earlier so that this does not
> happen.
My patch is buggy too. If a file is closed by another clone between
the two steal_locks calls the lock will again be lost. Fortunately
this is much harder to trigger than the previous bug.
The following patch fixes that.
--
Email: Herbert Xu 许志壬 <[email protected]>
On Sat, 9 Aug 2003, Herbert Xu wrote:
> Hi:
>
> The unshare_files patch to flush_old_exec() did not restore the original
> state when exec_mmap fails. This patch fixes that.
Indeed. This is still needed.
> At this point, I believe the unshare_files stuff should be fine from
> a correctness point of view. However, there is still a performance
> problem as every ELF exec call ends up duplicating the files structure
> as well as walking through all file locks.
Cheers,
Andreas.
On Sat, Aug 09, 2003 at 11:11:16AM +1000, herbert wrote:
>
> At this point, I believe the unshare_files stuff should be fine from
> a correctness point of view. However, there is still a performance
> problem as every ELF exec call ends up duplicating the files structure
> as well as walking through all file locks.
Here is the patch that ensures files is only duplicated when necessary.
--
Email: Herbert Xu 许志壬 <[email protected]>
Hello,
thanks a lot for analyzing this. Please see the patch I just posted under
the subject:
[PATCH] 2.4.22-rc2 steal_locks and load_elf_binary cleanups
That patch fixes the bug in a slightly cleaner way.
On Sat, 9 Aug 2003, Herbert Xu wrote:
> On Fri, Aug 08, 2003 at 08:53:21PM +1000, herbert wrote:
> >
> > The steal_locks() call in binfmt_elf.c is buggy. It steals locks from
> > a files entry whose reference was dropped much earlier. This allows it
> > to steal other processes' locks.
This makes sense.
> > The following patch calls steal_locks() earlier so that this does not
> > happen.
>
> My patch is buggy too. If a file is closed by another clone between
> the two steal_locks calls the lock will again be lost. Fortunately
> this is much harder to trigger than the previous bug.
>
> The following patch fixes that.
>
On Sat, 9 Aug 2003, Herbert Xu wrote:
> On Fri, Aug 08, 2003 at 08:53:21PM +1000, herbert wrote:
> >
> > The steal_locks() call in binfmt_elf.c is buggy. It steals locks from
> > a files entry whose reference was dropped much earlier. This allows it
> > to steal other processes' locks.
> >
> > The following patch calls steal_locks() earlier so that this does not
> > happen.
>
> My patch is buggy too. If a file is closed by another clone between
> the two steal_locks calls the lock will again be lost. Fortunately
> this is much harder to trigger than the previous bug.
I think this is not a strict bug---this scenario is not covered by POSIX
in the first place. Unless lock stealing is done atomically with
unshare_files there is a window of opportunity between unshare_files() and
steal_locks(), so locks can still get lost.
Cheers,
Andreas.
On Sat, 9 Aug 2003, Herbert Xu wrote:
> On Sat, Aug 09, 2003 at 11:11:16AM +1000, herbert wrote:
> >
> > At this point, I believe the unshare_files stuff should be fine from
> > a correctness point of view. However, there is still a performance
> > problem as every ELF exec call ends up duplicating the files structure
> > as well as walking through all file locks.
>
> Here is the patch that ensures files is only duplicated when necessary.
This patch is correct but unnecessary: steal_locks already tests for this
condition.
Cheers,
Andreas.
On Sat, Aug 09, 2003 at 04:20:38AM +0200, Andreas Gruenbacher wrote:
> On Sat, 9 Aug 2003, Herbert Xu wrote:
>
> > On Sat, Aug 09, 2003 at 11:11:16AM +1000, herbert wrote:
> > >
> > > At this point, I believe the unshare_files stuff should be fine from
> > > a correctness point of view. However, there is still a performance
> > > problem as every ELF exec call ends up duplicating the files structure
> > > as well as walking through all file locks.
> >
> > Here is the patch that ensures files is only duplicated when necessary.
>
> This patch is correct but unnecessary: steal_locks already tests for this
> condition.
Yes but when you call unshare_files twice one of them will have to
copy.
--
Email: Herbert Xu 许志壬 <[email protected]>
On Sat, Aug 09, 2003 at 04:04:53AM +0200, Andreas Gruenbacher wrote:
>
> > My patch is buggy too. If a file is closed by another clone between
> > the two steal_locks calls the lock will again be lost. Fortunately
> > this is much harder to trigger than the previous bug.
>
> I think this is not a strict bug---this scenario is not covered by POSIX
> in the first place. Unless lock stealing is done atomically with
> unshare_files there is a window of opportunity between unshare_files() and
> steal_locks(), so locks can still get lost.
It's not a standard compliance issue. In this case the lock will never
be released and it will eventually lead to a crash when someone reads
/proc/locks.
--
Email: Herbert Xu 许志壬 <[email protected]>
Here is a cumulative patch of Herbert's and my changes.
I could see no reason for ftmp in flush_old_exec and load_elf_binary, so I
removed that. Please correct me if that is wrong.
Cheers,
Andreas.
On Fri, 8 Aug 2003, Herbert Xu wrote:
> Hi:
>
> The steal_locks() call in binfmt_elf.c is buggy. It steals locks from
> a files entry whose reference was dropped much earlier. This allows it
> to steal other processes' locks.
>
> The following patch calls steal_locks() earlier so that this does not
> happen.
On Sat, 9 Aug 2003, Herbert Xu wrote:
> On Sat, Aug 09, 2003 at 04:20:38AM +0200, Andreas Gruenbacher wrote:
> > On Sat, 9 Aug 2003, Herbert Xu wrote:
> >
> > > On Sat, Aug 09, 2003 at 11:11:16AM +1000, herbert wrote:
> > > >
> > > > At this point, I believe the unshare_files stuff should be fine from
> > > > a correctness point of view. However, there is still a performance
> > > > problem as every ELF exec call ends up duplicating the files structure
> > > > as well as walking through all file locks.
> > >
> > > Here is the patch that ensures files is only duplicated when necessary.
> >
> > This patch is correct but unnecessary: steal_locks already tests for this
> > condition.
>
> Yes but when you call unshare_files twice one of them will have to
> copy.
I see---that happens through flush_old_exec.
On Sat, Aug 09, 2003 at 04:59:03AM +0200, Andreas Gruenbacher wrote:
>
> @@ -714,18 +715,16 @@ static int load_elf_binary(struct linux_
> elf_entry = load_elf_interp(&interp_elf_ex,
> interpreter,
> &interp_load_addr);
> -
> - allow_write_access(interpreter);
> - fput(interpreter);
> - kfree(elf_interpreter);
> -
> if (BAD_ADDR(elf_entry)) {
> printk(KERN_ERR "Unable to load interpreter\n");
> - kfree(elf_phdata);
> send_sig(SIGSEGV, current, 0);
> retval = -ENOEXEC; /* Nobody gets to see this, but.. */
> - goto out;
> + goto out_free_dentry;
> }
> +
> + allow_write_access(interpreter);
> + fput(interpreter);
> + kfree(elf_interpreter);
> }
This looks bad since you're past the point of no return.
--
Email: Herbert Xu 许志壬 <[email protected]>
On Sat, 9 Aug 2003, Herbert Xu wrote:
> On Sat, Aug 09, 2003 at 04:04:53AM +0200, Andreas Gruenbacher wrote:
> >
> > > My patch is buggy too. If a file is closed by another clone between
> > > the two steal_locks calls the lock will again be lost. Fortunately
> > > this is much harder to trigger than the previous bug.
> >
> > I think this is not a strict bug---this scenario is not covered by POSIX
> > in the first place. Unless lock stealing is done atomically with
> > unshare_files there is a window of opportunity between unshare_files() and
> > steal_locks(), so locks can still get lost.
>
> It's not a standard compliance issue. In this case the lock will never
> be released and it will eventually lead to a crash when someone reads
> /proc/locks.
I don't see how this would happen. Could you please elaborate?
On Sat, 9 Aug 2003, Herbert Xu wrote:
> On Sat, Aug 09, 2003 at 04:59:03AM +0200, Andreas Gruenbacher wrote:
> >
> > @@ -714,18 +715,16 @@ static int load_elf_binary(struct linux_
> > elf_entry = load_elf_interp(&interp_elf_ex,
> > interpreter,
> > &interp_load_addr);
> > -
> > - allow_write_access(interpreter);
> > - fput(interpreter);
> > - kfree(elf_interpreter);
> > -
> > if (BAD_ADDR(elf_entry)) {
> > printk(KERN_ERR "Unable to load interpreter\n");
> > - kfree(elf_phdata);
> > send_sig(SIGSEGV, current, 0);
> > retval = -ENOEXEC; /* Nobody gets to see this, but.. */
> > - goto out;
> > + goto out_free_dentry;
> > }
> > +
> > + allow_write_access(interpreter);
> > + fput(interpreter);
> > + kfree(elf_interpreter);
> > }
>
> This looks bad since you're past the point of no return.
This is an equivalence transformation except for the explicit
sys_close(elf_exec_fileno) in the unwind code, which would eventually
happen, anyway.
Here is an update ...
> On Sat, 9 Aug 2003, Herbert Xu wrote:
> > Yes but when you call unshare_files twice one of them will have to
> > copy.
On Sat, Aug 09, 2003 at 05:13:52AM +0200, Andreas Gruenbacher wrote:
> On Sat, 9 Aug 2003, Herbert Xu wrote:
>
> > On Sat, Aug 09, 2003 at 04:04:53AM +0200, Andreas Gruenbacher wrote:
> > >
> > > > My patch is buggy too. If a file is closed by another clone between
> > > > the two steal_locks calls the lock will again be lost. Fortunately
> > > > this is much harder to trigger than the previous bug.
> > >
> > > I think this is not a strict bug---this scenario is not covered by POSIX
> > > in the first place. Unless lock stealing is done atomically with
> > > unshare_files there is a window of opportunity between unshare_files() and
> > > steal_locks(), so locks can still get lost.
> >
> > It's not a standard compliance issue. In this case the lock will never
> > be released and it will eventually lead to a crash when someone reads
> > /proc/locks.
>
> I don't see how this would happen. Could you please elaborate?
Suppose that A and B share current->files and fd has a POSIX lock on it.
A                       B
unshare_files
steal_locks
                        close(fd)
exec fails
steal_locks
put_files_struct
The close in B fails to release the lock as it has been stolen by the
new files structure. The second steal_locks sets the fl_owner back to
the original files structure which no longer has fd in it and hence can
never release that lock. The put_files_struct doesn't release the lock
either since it is now owned by the original file structure.
--
Email: Herbert Xu 许志壬 <[email protected]>
On Sat, 2003-08-09 at 05:19, Herbert Xu wrote:
> On Sat, Aug 09, 2003 at 05:13:52AM +0200, Andreas Gruenbacher wrote:
> > On Sat, 9 Aug 2003, Herbert Xu wrote:
> >
> > > On Sat, Aug 09, 2003 at 04:04:53AM +0200, Andreas Gruenbacher wrote:
> > > >
> > > > > My patch is buggy too. If a file is closed by another clone between
> > > > > the two steal_locks calls the lock will again be lost. Fortunately
> > > > > this is much harder to trigger than the previous bug.
> > > >
> > > > I think this is not a strict bug---this scenario is not covered by POSIX
> > > > in the first place. Unless lock stealing is done atomically with
> > > > unshare_files there is a window of opportunity between unshare_files() and
> > > > steal_locks(), so locks can still get lost.
> > >
> > > It's not a standard compliance issue. In this case the lock will never
> > > be released and it will eventually lead to a crash when someone reads
> > > /proc/locks.
> >
> > I don't see how this would happen. Could you please elaborate?
>
> Suppose that A and B share current->files and fd has a POSIX lock on it.
>
> A                       B
> unshare_files
> steal_locks
>                         close(fd)
> exec fails
> steal_locks
> put_files_struct
>
> The close in B fails to release the lock as it has been stolen by the
> new files structure. The second steal_locks sets the fl_owner back to
> the original files structure which no longer has fd in it and hence can
> never release that lock. The put_files_struct doesn't release the lock
> either since it is now owned by the original file structure.
In the patch I've sent there is no stealing back of locks, so that case
does not exist.
Cheers,
--
Andreas Gruenbacher <[email protected]>
SuSE Labs, SuSE Linux AG <http://www.suse.de/>
Hi,
On Sat, 2003-08-09 at 12:05, Herbert Xu wrote:
> On Sat, Aug 09, 2003 at 05:54:32AM +0200, Andreas Gruenbacher wrote:
> > On Sat, 9 Aug 2003, Andreas Gruenbacher wrote:
> >
> > > Here is an update ...
> > Do you agree that this is correct?
>
> It looks OK to me. However, I still think the BAD_ADDR change is
> unnecessary.
Very good, thanks. The BAD_ADDR change is indeed not required. It saves
a function call so I think we should keep it, but I don't mind so much.
Cheers,
--
Andreas Gruenbacher <[email protected]>
SuSE Labs, SuSE Linux AG <http://www.suse.de/>