Hello, Al!
I have noticed that Mozilla Firefox gets stuck for seconds or minutes
on some sites, in particular on Facebook with Linux 3.7-rc1 and newer
mainline kernels. Disabling flash plugin fixes the delays.
This is a Fedora 17 system with SELinux enabled, on x86_64
architecture, with all updates, with LXDE desktop. It's not the
Fedora 16 system I mentioned before, it has never had LXDE login
problems due to replace_fd().
Bisecting lead me to the patch that introduced iterate_fd():
commit c3c073f808b22dfae15ef8412b6f7b998644139a
Author: Al Viro <[email protected]>
Date: Tue Aug 21 22:32:06 2012 -0400
new helper: iterate_fd()
iterates through the opened files in given descriptor table,
calling a supplied function; we stop once non-zero is returned.
Callback gets struct file *, descriptor number and const void *
argument passed to iterator. It is called with files->file_lock
held, so it is not allowed to block.
tty_io, netprio_cgroup and selinux flush_unauthorized_files()
converted to its use.
Signed-off-by: Al Viro <[email protected]>
I have found that reverting the changes to security/selinux/hooks.c is
sufficient to restore the correct behavior.
--
Regards,
Pavel Roskin
Quoting Pavel Roskin <[email protected]>:
> Hello, Al!
>
> I have noticed that Mozilla Firefox gets stuck for seconds or minutes
> on some sites, in particular on Facebook with Linux 3.7-rc1 and newer
> mainline kernels. Disabling flash plugin fixes the delays.
>
> This is a Fedora 17 system with SELinux enabled, on x86_64
> architecture, with all updates, with LXDE desktop. It's not the Fedora
> 16 system I mentioned before, it has never had LXDE login problems due
> to replace_fd().
>
> Bisecting lead me to the patch that introduced iterate_fd():
>
> commit c3c073f808b22dfae15ef8412b6f7b998644139a
> Author: Al Viro <[email protected]>
> Date: Tue Aug 21 22:32:06 2012 -0400
>
> new helper: iterate_fd()
>
> iterates through the opened files in given descriptor table,
> calling a supplied function; we stop once non-zero is returned.
> Callback gets struct file *, descriptor number and const void *
> argument passed to iterator. It is called with files->file_lock
> held, so it is not allowed to block.
>
> tty_io, netprio_cgroup and selinux flush_unauthorized_files()
> converted to its use.
>
> Signed-off-by: Al Viro <[email protected]>
>
> I have found that reverting the changes to security/selinux/hooks.c is
> sufficient to restore the correct behavior.
>
> --
> Regards,
> Pavel Roskin
I've made a bugzilla entry for the bug and put a preliminary patch there.
https://bugzilla.kernel.org/show_bug.cgi?id=50401
--
Regards,
Pavel Roskin
OMG this +1 -1 stuff is nuts...
iterate_fd passes fd+1 instead of the fd for the file? seriously?
ewwww. I see how this patch fixes it, but still, wouldn't some magic
or negative value for no entries found be better than having to
understand the crazy logics?
/me pokes Al.
On Sun, Nov 11, 2012 at 11:50 PM, Pavel Roskin <[email protected]> wrote:
> Quoting Pavel Roskin <[email protected]>:
>
>> Hello, Al!
>>
>> I have noticed that Mozilla Firefox gets stuck for seconds or minutes
>> on some sites, in particular on Facebook with Linux 3.7-rc1 and newer
>> mainline kernels. Disabling flash plugin fixes the delays.
>>
>> This is a Fedora 17 system with SELinux enabled, on x86_64
>> architecture, with all updates, with LXDE desktop. It's not the Fedora
>> 16 system I mentioned before, it has never had LXDE login problems due
>> to replace_fd().
>>
>> Bisecting lead me to the patch that introduced iterate_fd():
>>
>> commit c3c073f808b22dfae15ef8412b6f7b998644139a
>> Author: Al Viro <[email protected]>
>> Date: Tue Aug 21 22:32:06 2012 -0400
>>
>> new helper: iterate_fd()
>>
>> iterates through the opened files in given descriptor table,
>> calling a supplied function; we stop once non-zero is returned.
>> Callback gets struct file *, descriptor number and const void *
>> argument passed to iterator. It is called with files->file_lock
>> held, so it is not allowed to block.
>>
>> tty_io, netprio_cgroup and selinux flush_unauthorized_files()
>> converted to its use.
>>
>> Signed-off-by: Al Viro <[email protected]>
>>
>> I have found that reverting the changes to security/selinux/hooks.c is
>> sufficient to restore the correct behavior.
>>
>> --
>> Regards,
>> Pavel Roskin
>
>
> I've made a bugzilla entry for the bug and put a preliminary patch there.
> https://bugzilla.kernel.org/show_bug.cgi?id=50401
>
>
> --
> Regards,
> Pavel Roskin
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
Quoting Eric Paris <[email protected]>:
> OMG this +1 -1 stuff is nuts...
>
> iterate_fd passes fd+1 instead of the fd for the file? seriously?
> ewwww. I see how this patch fixes it, but still, wouldn't some magic
> or negative value for no entries found be better than having to
> understand the crazy logics?
I agree. -ENOENT is both magic and negative :)
I tend to think now that iterate_fd() should be rewritten before it's too
late and all its current users should be cross-checked against the
code prior to the introduction of iterate_fd().
--
Regards,
Pavel Roskin
On Mon, Nov 12, 2012 at 11:57 AM, Pavel Roskin <[email protected]> wrote:
> Quoting Eric Paris <[email protected]>:
>
>> OMG this +1 -1 stuff is nuts...
Ping, Al.
int iterate_fd(struct files_struct *files, unsigned n,
[snip]
while (!res && n < fdt->max_fds) {
file = rcu_dereference_check_fdtable(files, fdt->fd[n++]);
if (file)
res = f(p, file, n);
}
spin_unlock(&files->file_lock);
return res;
So we increment n (the file descriptor number) in the dereference,
then pass that (wrong) number to f().
Every single f() (including SELinux, the cause of this bug) returns
fd+1 (so now we are up by 2). Then all of the users of iterate fd
actually use fd-1 (which is wrong)
Why not have iterate_fd return -ENOENT on no entries and stop all of
the stupid games? We fix the real bug (the above function should do
the n++ after the f() call, and the interface is sane to design
against...
-Eric
On Fri, Nov 16, 2012 at 02:58:46PM -0500, Eric Paris wrote:
> On Mon, Nov 12, 2012 at 11:57 AM, Pavel Roskin <[email protected]> wrote:
> > Quoting Eric Paris <[email protected]>:
> >
> >> OMG this +1 -1 stuff is nuts...
>
> Ping, Al.
>
> int iterate_fd(struct files_struct *files, unsigned n,
> [snip]
> while (!res && n < fdt->max_fds) {
> file = rcu_dereference_check_fdtable(files, fdt->fd[n++]);
> if (file)
> res = f(p, file, n);
> }
> spin_unlock(&files->file_lock);
> return res;
>
> So we increment n (the file descriptor number) in the dereference,
> then pass that (wrong) number to f().
>
> Every single f() (including SELinux, the cause of this bug) returns
> fd+1 (so now we are up by 2). Then all of the users of iterate fd
> actually use fd-1 (which is wrong)
>
> Why not have iterate_fd return -ENOENT on no entries and stop all of
> the stupid games? We fix the real bug (the above function should do
> the n++ after the f() call, and the interface is sane to design
> against...
Because we might bloody well want to have "run some test on all opened
files, return the first error". And -ENOENT is quite possible one.
Moreover, -ENOENT for "everything's OK, keep going" would be really
weird.
The bug is real, but Pavel's patch is all wrong. The problem is in the
argument; we should pass descriptor number, not descriptor + 1. And fixing
that (in iterator_fd() itself) makes all callbacks work as they ought to.
PS: Pavel, the life is painful enough as it is, no need to involve BZ into
it. Next time you need to post a patch, please do just that, especially
when it's so short, OK?
On Fri, Nov 30, 2012 at 03:40:34AM +0000, Al Viro wrote:
> The bug is real, but Pavel's patch is all wrong. The problem is in the
> argument; we should pass descriptor number, not descriptor + 1. And fixing
> that (in iterator_fd() itself) makes all callbacks work as they ought to.
>
> PS: Pavel, the life is painful enough as it is, no need to involve BZ into
> it. Next time you need to post a patch, please do just that, especially
> when it's so short, OK?
See tonight's vfs.git#for-linus; the head is at commit a77cfcb.
Quoting Al Viro <[email protected]>:
> On Fri, Nov 30, 2012 at 03:40:34AM +0000, Al Viro wrote:
>
>> The bug is real, but Pavel's patch is all wrong. The problem is in the
>> argument; we should pass descriptor number, not descriptor + 1. And fixing
>> that (in iterator_fd() itself) makes all callbacks work as they ought to.
>>
>> PS: Pavel, the life is painful enough as it is, no need to involve BZ into
>> it. Next time you need to post a patch, please do just that, especially
>> when it's so short, OK?
>
> See tonight's vfs.git#for-linus; the head is at commit a77cfcb.
Works for me! Thank you! Sorry if I mishandled anything.
--
Regards,
Pavel Roskin