2019-12-31 15:04:34

by Dominik Brodowski

[permalink] [raw]
Subject: [PATCH] early init: open /dev/console with O_LARGEFILE

If force_o_largefile() is true, /dev/console used to be opened
with O_LARGEFILE. Retain that behaviour.

Fixes: 8243186f0cc7 ("fs: remove ksys_dup()")
Signed-off-by: Dominik Brodowski <[email protected]>

diff --git a/init/main.c b/init/main.c
index 1ecfd43ed464..d12777775cb0 100644
--- a/init/main.c
+++ b/init/main.c
@@ -1162,7 +1162,8 @@ void console_on_rootfs(void)
unsigned int i;

/* Open /dev/console in kernelspace, this should never fail */
- file = filp_open("/dev/console", O_RDWR, 0);
+ file = filp_open("/dev/console",
+ O_RDWR | (force_o_largefile() ? O_LARGEFILE : 0), 0);
if (IS_ERR(file))
goto err_out;


2019-12-31 15:57:28

by youling 257

[permalink] [raw]
Subject: Re: [PATCH] early init: open /dev/console with O_LARGEFILE

test this patch, still can see /system/bin/sh warning.
this patch no help for me.

2019-12-31 23:02 GMT+08:00, Dominik Brodowski <[email protected]>:
> If force_o_largefile() is true, /dev/console used to be opened
> with O_LARGEFILE. Retain that behaviour.
>
> Fixes: 8243186f0cc7 ("fs: remove ksys_dup()")
> Signed-off-by: Dominik Brodowski <[email protected]>
>
> diff --git a/init/main.c b/init/main.c
> index 1ecfd43ed464..d12777775cb0 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -1162,7 +1162,8 @@ void console_on_rootfs(void)
> unsigned int i;
>
> /* Open /dev/console in kernelspace, this should never fail */
> - file = filp_open("/dev/console", O_RDWR, 0);
> + file = filp_open("/dev/console",
> + O_RDWR | (force_o_largefile() ? O_LARGEFILE : 0), 0);
> if (IS_ERR(file))
> goto err_out;
>
>

2020-01-01 00:34:46

by Arvind Sankar

[permalink] [raw]
Subject: Re: [PATCH] early init: open /dev/console with O_LARGEFILE

On Tue, Dec 31, 2019 at 04:02:26PM +0100, Dominik Brodowski wrote:
> If force_o_largefile() is true, /dev/console used to be opened
> with O_LARGEFILE. Retain that behaviour.
>

One other thing that used to happen is fsnotify_open() -- I don't know
how that might affect this, but it seems like the only thing left that's
different.

2020-01-01 10:46:50

by Dominik Brodowski

[permalink] [raw]
Subject: Re: [PATCH] early init: open /dev/console with O_LARGEFILE

@youling 257: could you test the attached patch, please?

Thanks,
Dominik

On Tue, Dec 31, 2019 at 07:30:19PM -0500, Arvind Sankar wrote:
> On Tue, Dec 31, 2019 at 04:02:26PM +0100, Dominik Brodowski wrote:
> > If force_o_largefile() is true, /dev/console used to be opened
> > with O_LARGEFILE. Retain that behaviour.
> >
>
> One other thing that used to happen is fsnotify_open() -- I don't know
> how that might affect this, but it seems like the only thing left that's
> different.

Suggested-by: Arvind Sankar <[email protected]>
Signed-off-by: Dominik Brodowski <[email protected]>

diff --git a/init/main.c b/init/main.c
index d12777775cb0..3f4163046200 100644
--- a/init/main.c
+++ b/init/main.c
@@ -94,6 +94,7 @@
#include <linux/jump_label.h>
#include <linux/mem_encrypt.h>
#include <linux/file.h>
+#include <linux/fsnotify.h>

#include <asm/io.h>
#include <asm/bugs.h>
@@ -1166,6 +1167,7 @@ void console_on_rootfs(void)
O_RDWR | (force_o_largefile() ? O_LARGEFILE : 0), 0);
if (IS_ERR(file))
goto err_out;
+ fsnotify_open(file);

/* create stdin/stdout/stderr, this should never fail */
for (i = 0; i < 3; i++) {

2020-01-01 13:33:09

by youling 257

[permalink] [raw]
Subject: Re: [PATCH] early init: open /dev/console with O_LARGEFILE

Unfortunately, test this patch still no help, has system/bin/sh warning.

2020-01-01 18:43 GMT+08:00, Dominik Brodowski <[email protected]>:
> @youling 257: could you test the attached patch, please?
>
> Thanks,
> Dominik
>
> On Tue, Dec 31, 2019 at 07:30:19PM -0500, Arvind Sankar wrote:
>> On Tue, Dec 31, 2019 at 04:02:26PM +0100, Dominik Brodowski wrote:
>> > If force_o_largefile() is true, /dev/console used to be opened
>> > with O_LARGEFILE. Retain that behaviour.
>> >
>>
>> One other thing that used to happen is fsnotify_open() -- I don't know
>> how that might affect this, but it seems like the only thing left that's
>> different.
>
> Suggested-by: Arvind Sankar <[email protected]>
> Signed-off-by: Dominik Brodowski <[email protected]>
>
> diff --git a/init/main.c b/init/main.c
> index d12777775cb0..3f4163046200 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -94,6 +94,7 @@
> #include <linux/jump_label.h>
> #include <linux/mem_encrypt.h>
> #include <linux/file.h>
> +#include <linux/fsnotify.h>
>
> #include <asm/io.h>
> #include <asm/bugs.h>
> @@ -1166,6 +1167,7 @@ void console_on_rootfs(void)
> O_RDWR | (force_o_largefile() ? O_LARGEFILE : 0), 0);
> if (IS_ERR(file))
> goto err_out;
> + fsnotify_open(file);
>
> /* create stdin/stdout/stderr, this should never fail */
> for (i = 0; i < 3; i++) {
>

2020-01-01 17:31:18

by Dominik Brodowski

[permalink] [raw]
Subject: Re: [PATCH] early init: open /dev/console with O_LARGEFILE

On Wed, Jan 01, 2020 at 09:27:27PM +0800, youling 257 wrote:
> Unfortunately, test this patch still no help, has system/bin/sh warning.

Thanks for testing! Could you test one more try, please (on top of current
mainline)? It's mainly a revert, and then an attempt to get ksys_open()
right.

Thanks,
Dominik


NOTE: test patch only. Do not commit.

diff --git a/fs/file.c b/fs/file.c
index 2f4fcf985079..3da91a112bab 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -960,7 +960,7 @@ SYSCALL_DEFINE2(dup2, unsigned int, oldfd, unsigned int, newfd)
return ksys_dup3(oldfd, newfd, 0);
}

-SYSCALL_DEFINE1(dup, unsigned int, fildes)
+int ksys_dup(unsigned int fildes)
{
int ret = -EBADF;
struct file *file = fget_raw(fildes);
@@ -975,6 +975,11 @@ SYSCALL_DEFINE1(dup, unsigned int, fildes)
return ret;
}

+SYSCALL_DEFINE1(dup, unsigned int, fildes)
+{
+ return ksys_dup(fildes);
+}
+
int f_dupfd(unsigned int from, struct file *file, unsigned flags)
{
int err;
diff --git a/fs/open.c b/fs/open.c
index b62f5c0923a8..270767c9966b 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -1079,7 +1079,50 @@ struct file *file_open_root(struct dentry *dentry, struct vfsmount *mnt,
}
EXPORT_SYMBOL(file_open_root);

-long do_sys_open(int dfd, const char __user *filename, int flags, umode_t mode)
+/**
+ * ksys_open - open file and return fd
+ *
+ * @filename: path to open
+ * @flags: open flags as per the open(2) second argument
+ * @mode: mode for the new file if O_CREAT is set, else ignored
+ *
+ * This is the helper to open a file from kernelspace if you really
+ * have to. But in generally you should not do this, so please move
+ * along, nothing to see here..
+ */
+long ksys_open(const char *filename, int flags, umode_t mode)
+{
+ struct open_flags op;
+ struct filename *tmp;
+ int fd;
+
+ if (force_o_largefile())
+ flags |= O_LARGEFILE;
+
+ fd = build_open_flags(flags, mode, &op);
+ if (fd)
+ return fd;
+
+ tmp = getname_kernel(filename);
+ if (IS_ERR(tmp))
+ return PTR_ERR(tmp);
+
+ fd = get_unused_fd_flags(flags);
+ if (fd >= 0) {
+ struct file *f = do_filp_open(AT_FDCWD, tmp, &op);
+ if (IS_ERR(f)) {
+ put_unused_fd(fd);
+ fd = PTR_ERR(f);
+ } else {
+ fsnotify_open(f);
+ fd_install(fd, f);
+ }
+ }
+ putname(tmp);
+ return fd;
+}
+
+static long do_sys_open(int dfd, const char __user *filename, int flags, umode_t mode)
{
struct open_flags op;
int fd = build_open_flags(flags, mode, &op);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 98e0349adb52..8ffe1fd6b6dc 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2528,8 +2528,6 @@ extern int do_truncate(struct dentry *, loff_t start, unsigned int time_attrs,
struct file *filp);
extern int vfs_fallocate(struct file *file, int mode, loff_t offset,
loff_t len);
-extern long do_sys_open(int dfd, const char __user *filename, int flags,
- umode_t mode);
extern struct file *file_open_name(struct filename *, int, umode_t);
extern struct file *filp_open(const char *, int, umode_t);
extern struct file *file_open_root(struct dentry *, struct vfsmount *,
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 2960dedcfde8..6aa988e66e7b 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -1232,6 +1232,7 @@ asmlinkage long sys_ni_syscall(void);
*/

int ksys_umount(char __user *name, int flags);
+int ksys_dup(unsigned int fildes);
int ksys_chroot(const char __user *filename);
ssize_t ksys_write(unsigned int fd, const char __user *buf, size_t count);
int ksys_chdir(const char __user *filename);
@@ -1371,16 +1372,7 @@ static inline int ksys_close(unsigned int fd)
return __close_fd(current->files, fd);
}

-extern long do_sys_open(int dfd, const char __user *filename, int flags,
- umode_t mode);
-
-static inline long ksys_open(const char __user *filename, int flags,
- umode_t mode)
-{
- if (force_o_largefile())
- flags |= O_LARGEFILE;
- return do_sys_open(AT_FDCWD, filename, flags, mode);
-}
+extern long ksys_open(const char *filename, int flags, umode_t mode);

extern long do_sys_truncate(const char __user *pathname, loff_t length);

diff --git a/init/main.c b/init/main.c
index 1ecfd43ed464..72aa063bf29e 100644
--- a/init/main.c
+++ b/init/main.c
@@ -93,7 +93,6 @@
#include <linux/rodata_test.h>
#include <linux/jump_label.h>
#include <linux/mem_encrypt.h>
-#include <linux/file.h>

#include <asm/io.h>
#include <asm/bugs.h>
@@ -1158,26 +1157,13 @@ static int __ref kernel_init(void *unused)

void console_on_rootfs(void)
{
- struct file *file;
- unsigned int i;
-
- /* Open /dev/console in kernelspace, this should never fail */
- file = filp_open("/dev/console", O_RDWR, 0);
- if (IS_ERR(file))
- goto err_out;
-
- /* create stdin/stdout/stderr, this should never fail */
- for (i = 0; i < 3; i++) {
- if (f_dupfd(i, file, 0) != i)
- goto err_out;
- }
-
- return;
+ /* Open the /dev/console as stdin, this should never fail */
+ if (ksys_open("/dev/console", O_RDWR, 0) < 0)
+ pr_err("Warning: unable to open an initial console.\n");

-err_out:
- /* no panic -- this might not be fatal */
- pr_err("Warning: unable to open an initial console.\n");
- return;
+ /* create stdout/stderr */
+ (void) ksys_dup(0);
+ (void) ksys_dup(0);
}

static noinline void __init kernel_init_freeable(void)

2020-01-01 18:03:59

by Arvind Sankar

[permalink] [raw]
Subject: Re: [PATCH] early init: open /dev/console with O_LARGEFILE

On Wed, Jan 01, 2020 at 09:27:27PM +0800, youling 257 wrote:
> Unfortunately, test this patch still no help, has system/bin/sh warning.
>

Just to confirm, the only change needed to make the warning go away is
reverting the single commit 8243186f0cc7 ("fs: remove ksys_dup()")?

I don't get how that thing, even if it gets something wrong, impacts the
shell on a virtual console, which should be a long way downstream from
the init process that got handed /dev/console.

2020-01-01 18:35:02

by Arvind Sankar

[permalink] [raw]
Subject: Re: [PATCH] early init: open /dev/console with O_LARGEFILE

On Tue, Dec 31, 2019 at 07:30:19PM -0500, Arvind Sankar wrote:
> On Tue, Dec 31, 2019 at 04:02:26PM +0100, Dominik Brodowski wrote:
> > If force_o_largefile() is true, /dev/console used to be opened
> > with O_LARGEFILE. Retain that behaviour.
> >
>
> One other thing that used to happen is fsnotify_open() -- I don't know
> how that might affect this, but it seems like the only thing left that's
> different.

Also, this shouldn't impact the current issue I think, but won't doing
filp_open followed by 3 f_dupfd's cause the file->f_count to become 4
but with only 3 open fd's? Don't we have to do an fd_install for the
first one and only 2 f_dupfd's?

2020-01-01 18:50:31

by Dominik Brodowski

[permalink] [raw]
Subject: Re: [PATCH] early init: open /dev/console with O_LARGEFILE

On Wed, Jan 01, 2020 at 12:59:32PM -0500, Arvind Sankar wrote:
> On Wed, Jan 01, 2020 at 09:27:27PM +0800, youling 257 wrote:
> > Unfortunately, test this patch still no help, has system/bin/sh warning.
> >
>
> Just to confirm, the only change needed to make the warning go away is
> reverting the single commit 8243186f0cc7 ("fs: remove ksys_dup()")?

That's what he reported, yes.

Thanks,
Dominik

2020-01-01 19:02:55

by youling 257

[permalink] [raw]
Subject: Re: [PATCH] early init: open /dev/console with O_LARGEFILE

of course, test this revert patch, no the system/bin/sh warning.

2020-01-02 2:42 GMT+08:00, Dominik Brodowski <[email protected]>:
> On Wed, Jan 01, 2020 at 12:59:32PM -0500, Arvind Sankar wrote:
>> On Wed, Jan 01, 2020 at 09:27:27PM +0800, youling 257 wrote:
>> > Unfortunately, test this patch still no help, has system/bin/sh
>> > warning.
>> >
>>
>> Just to confirm, the only change needed to make the warning go away is
>> reverting the single commit 8243186f0cc7 ("fs: remove ksys_dup()")?
>
> That's what he reported, yes.
>
> Thanks,
> Dominik
>

2020-01-01 21:41:51

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH] early init: open /dev/console with O_LARGEFILE

On Wed, Jan 1, 2020 at 10:32 AM Arvind Sankar <[email protected]> wrote:
>
> Also, this shouldn't impact the current issue I think, but won't doing
> filp_open followed by 3 f_dupfd's cause the file->f_count to become 4
> but with only 3 open fd's? Don't we have to do an fd_install for the
> first one and only 2 f_dupfd's?

I think *this* is the real reason for the odd regression.

Because we're leaking a file count, the original /dev/console stays
open, and we end up never calling file->f_op->release().

So we don't call tty_release() on that original /dev/console open, and
one effect of that is that we never then call session_clear_tty(),
which should make sure that all the processes in that session ID have
their controlling tty (signal->tty pointer) cleared.

And if that original controlling tty wasn't cleared, then subsequent
calls to set the controlling tty won't do anything, and who knows what
odd session leader confusion we might have.

youling, can you check if - instead of the revert - this simple 'add
an fput' fixes your warning.

I'm not saying that the revert is wrong at this point, but honestly,
I'd like to avoid the "we revert because we didn't understand what
went wrong". It would be good to go "Aaaahhhh, _that_ was the
problem".

Linus


Attachments:
patch.diff (381.00 B)

2020-01-01 22:51:48

by Arvind Sankar

[permalink] [raw]
Subject: Re: [PATCH] early init: open /dev/console with O_LARGEFILE

On Wed, Jan 01, 2020 at 01:39:00PM -0800, Linus Torvalds wrote:
> On Wed, Jan 1, 2020 at 10:32 AM Arvind Sankar <[email protected]> wrote:
> >
> > Also, this shouldn't impact the current issue I think, but won't doing
> > filp_open followed by 3 f_dupfd's cause the file->f_count to become 4
> > but with only 3 open fd's? Don't we have to do an fd_install for the
> > first one and only 2 f_dupfd's?
>
> I think *this* is the real reason for the odd regression.
>
> Because we're leaking a file count, the original /dev/console stays
> open, and we end up never calling file->f_op->release().
>
> So we don't call tty_release() on that original /dev/console open, and
> one effect of that is that we never then call session_clear_tty(),
> which should make sure that all the processes in that session ID have
> their controlling tty (signal->tty pointer) cleared.
>
> And if that original controlling tty wasn't cleared, then subsequent
> calls to set the controlling tty won't do anything, and who knows what
> odd session leader confusion we might have.
>
> youling, can you check if - instead of the revert - this simple 'add
> an fput' fixes your warning.
>
> I'm not saying that the revert is wrong at this point, but honestly,
> I'd like to avoid the "we revert because we didn't understand what
> went wrong". It would be good to go "Aaaahhhh, _that_ was the
> problem".
>
> Linus

Shouldn't that only affect init though? The getty's it spawns should be
in their own sessions.

2020-01-01 23:02:27

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH] early init: open /dev/console with O_LARGEFILE

On Wed, Jan 1, 2020 at 2:50 PM Arvind Sankar <[email protected]> wrote:
>
> Shouldn't that only affect init though? The getty's it spawns should be
> in their own sessions.

They *should* be in their own sessions, and clearly this problem
doesn't seem to really affect much anybody else.

But I think youling has some limited and/or odd init userspace, and I
think it gets confused.

So my theory is that because of the file descriptor leak, that "forget
the old controlling tty" doesn't happen, and then subsequent tty opens
don't do the right thing.

Maybe.

But it's the only real semantic change I can see in that whole patch.

Linus

2020-01-01 23:11:11

by Arvind Sankar

[permalink] [raw]
Subject: Re: [PATCH] early init: open /dev/console with O_LARGEFILE

On Wed, Jan 01, 2020 at 03:01:12PM -0800, Linus Torvalds wrote:
> On Wed, Jan 1, 2020 at 2:50 PM Arvind Sankar <[email protected]> wrote:
> >
> > Shouldn't that only affect init though? The getty's it spawns should be
> > in their own sessions.
>
> They *should* be in their own sessions, and clearly this problem
> doesn't seem to really affect much anybody else.
>
> But I think youling has some limited and/or odd init userspace, and I
> think it gets confused.
>
> So my theory is that because of the file descriptor leak, that "forget
> the old controlling tty" doesn't happen, and then subsequent tty opens
> don't do the right thing.
>
> Maybe.
>
> But it's the only real semantic change I can see in that whole patch.
>
> Linus

Ok. If you do end up going with this rather than the revert, one minor
nit with the patch -- if somehow the filp_open succeeds but one of the
f_dupfd's fails, you still need to do an fput to avoid leaking the
reference.

2020-01-02 03:10:46

by youling 257

[permalink] [raw]
Subject: Re: [PATCH] early init: open /dev/console with O_LARGEFILE

2020-01-02 5:39 GMT+08:00, Linus Torvalds <[email protected]>:
> On Wed, Jan 1, 2020 at 10:32 AM Arvind Sankar <[email protected]>
> wrote:
>>
>> Also, this shouldn't impact the current issue I think, but won't doing
>> filp_open followed by 3 f_dupfd's cause the file->f_count to become 4
>> but with only 3 open fd's? Don't we have to do an fd_install for the
>> first one and only 2 f_dupfd's?
>
> I think *this* is the real reason for the odd regression.
>
> Because we're leaking a file count, the original /dev/console stays
> open, and we end up never calling file->f_op->release().
>
> So we don't call tty_release() on that original /dev/console open, and
> one effect of that is that we never then call session_clear_tty(),
> which should make sure that all the processes in that session ID have
> their controlling tty (signal->tty pointer) cleared.
>
> And if that original controlling tty wasn't cleared, then subsequent
> calls to set the controlling tty won't do anything, and who knows what
> odd session leader confusion we might have.
>
> youling, can you check if - instead of the revert - this simple 'add
> an fput' fixes your warning.
>
> I'm not saying that the revert is wrong at this point, but honestly,
> I'd like to avoid the "we revert because we didn't understand what
> went wrong". It would be good to go "Aaaahhhh, _that_ was the
> problem".
>
> Linus
>

test this patch.diff, no the system/bin/sh warning, fixed.

2020-01-02 04:02:30

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH] early init: open /dev/console with O_LARGEFILE

On Wed, Jan 1, 2020 at 7:09 PM youling 257 <[email protected]> wrote:
>
> test this patch.diff, no the system/bin/sh warning, fixed.

Good to finally have figured out the silly reason this broke.

I may end up deciding just to revert after all, but even if I do that,
I'll just feel a lot better about knowing _why_ things broke.

Thanks a lot for all the testing you did,

Linus

2020-01-02 04:11:59

by Arvind Sankar

[permalink] [raw]
Subject: Re: [PATCH] early init: open /dev/console with O_LARGEFILE

On Wed, Jan 01, 2020 at 08:00:25PM -0800, Linus Torvalds wrote:
> On Wed, Jan 1, 2020 at 7:09 PM youling 257 <[email protected]> wrote:
> >
> > test this patch.diff, no the system/bin/sh warning, fixed.
>
> Good to finally have figured out the silly reason this broke.
>
> I may end up deciding just to revert after all, but even if I do that,
> I'll just feel a lot better about knowing _why_ things broke.
>
> Thanks a lot for all the testing you did,
>
> Linus

Huh. Feels good to have helped finding the bug, even though I thought it
couldn't *possibly* be the source of the problem.

"There are more things in heaven and earth, Horatio,
Than are dreamt of in your philosophy."

2020-01-02 17:54:09

by Dominik Brodowski

[permalink] [raw]
Subject: Re: [PATCH] early init: open /dev/console with O_LARGEFILE

On Wed, Jan 01, 2020 at 01:39:00PM -0800, Linus Torvalds wrote:
> I'm not saying that the revert is wrong at this point,

Linus, I'd like you to revert the patch nonetheless -- it was quite broken,
it is still broken and probably needs, besides your fix, the O_LARGEFILE
patch I sent out a few days ago.

What is more, I think a different approach is saner, leads to smaller code,
and creates far less risks. And maybe Al Viro likes it a bit more as well :)

At

https://git.kernel.org/pub/scm/linux/kernel/git/brodo/linux.git ksys-next

there are a few proof-of-concept patches for three in-kernel-syscalls:

(1) Both ksys_open() and ksys_unlink() can trivially operate properly on a
kernelspace pointer (by means of calling getname_kernel() instead of
getname()), such as:

-static inline long ksys_unlink(const char __user *pathname)
+/* note: operates on a kernelspace pointer to pathname */
+static inline long ksys_unlink(const char *pathname)
{
- return do_unlinkat(AT_FDCWD, getname(pathname));
+ return do_unlinkat(AT_FDCWD, getname_kernel(pathname));
}

(with all callers of ksys_unlinked() checked, of course).


(2) For ksys_mkdir(), we already have a fully vfs-in-kernelspace variant in
devtmpfs::dev_mkdir(), which we can use for ksys_mkdir() as well.


What do you (and others) think of these alternative approaches? If we should
work in this direction, please revert the patch -- the new patches should go
into v5.6 at earliest, and probably be routed via Al Viro (unless he and/or
you want me to have this tree managed independently, and included in
linux-next separately).

Thanks,
Dominik