2023-04-21 08:24:02

by syzbot

[permalink] [raw]
Subject: [syzbot] [kernel?] KCSAN: data-race in __fput / __tty_hangup (4)

Hello,

syzbot found the following issue on:

HEAD commit: fcd476ea6a88 Merge tag 'urgent-rcu.2023.03.28a' of git://g..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=146618b9c80000
kernel config: https://syzkaller.appspot.com/x/.config?x=f7350c77b8056a38
dashboard link: https://syzkaller.appspot.com/bug?extid=b7c3ba8cdc2f6cf83c21
compiler: Debian clang version 15.0.7, GNU ld (GNU Binutils for Debian) 2.35.2

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/f3d8ce4faab0/disk-fcd476ea.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/fc53d9dee279/vmlinux-fcd476ea.xz
kernel image: https://storage.googleapis.com/syzbot-assets/22ad755d39b2/bzImage-fcd476ea.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: [email protected]

==================================================================
BUG: KCSAN: data-race in __fput / __tty_hangup

write to 0xffff88813b31e828 of 8 bytes by task 17702 on cpu 0:
__tty_hangup+0x1e2/0x510 drivers/tty/tty_io.c:622
tty_vhangup+0x17/0x20 drivers/tty/tty_io.c:701
pty_close+0x262/0x280 drivers/tty/pty.c:79
tty_release+0x204/0x930 drivers/tty/tty_io.c:1753
__fput+0x245/0x570 fs/file_table.c:321
____fput+0x15/0x20 fs/file_table.c:349
task_work_run+0x123/0x160 kernel/task_work.c:179
get_signal+0xe5c/0xfe0 kernel/signal.c:2635
arch_do_signal_or_restart+0x89/0x2b0 arch/x86/kernel/signal.c:306
exit_to_user_mode_loop+0x6d/0xe0 kernel/entry/common.c:168
exit_to_user_mode_prepare+0x6a/0xa0 kernel/entry/common.c:204
__syscall_exit_to_user_mode_work kernel/entry/common.c:286 [inline]
syscall_exit_to_user_mode+0x26/0x140 kernel/entry/common.c:297
do_syscall_64+0x4d/0xc0 arch/x86/entry/common.c:86
entry_SYSCALL_64_after_hwframe+0x63/0xcd

read to 0xffff88813b31e828 of 8 bytes by task 17705 on cpu 1:
__fput+0x21c/0x570 fs/file_table.c:320
____fput+0x15/0x20 fs/file_table.c:349
task_work_run+0x123/0x160 kernel/task_work.c:179
get_signal+0xe5c/0xfe0 kernel/signal.c:2635
arch_do_signal_or_restart+0x89/0x2b0 arch/x86/kernel/signal.c:306
exit_to_user_mode_loop+0x6d/0xe0 kernel/entry/common.c:168
exit_to_user_mode_prepare+0x6a/0xa0 kernel/entry/common.c:204
__syscall_exit_to_user_mode_work kernel/entry/common.c:286 [inline]
syscall_exit_to_user_mode+0x26/0x140 kernel/entry/common.c:297
do_syscall_64+0x4d/0xc0 arch/x86/entry/common.c:86
entry_SYSCALL_64_after_hwframe+0x63/0xcd

value changed: 0xffffffff84e91ed0 -> 0xffffffff84e91dc0

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 17705 Comm: syz-executor.0 Not tainted 6.3.0-rc4-syzkaller-00034-gfcd476ea6a88 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/17/2023
==================================================================


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.


2023-04-21 08:36:11

by Dmitry Vyukov

[permalink] [raw]
Subject: Re: [syzbot] [kernel?] KCSAN: data-race in __fput / __tty_hangup (4)

On Fri, 21 Apr 2023 at 10:18, syzbot
<[email protected]> wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: fcd476ea6a88 Merge tag 'urgent-rcu.2023.03.28a' of git://g..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=146618b9c80000
> kernel config: https://syzkaller.appspot.com/x/.config?x=f7350c77b8056a38
> dashboard link: https://syzkaller.appspot.com/bug?extid=b7c3ba8cdc2f6cf83c21
> compiler: Debian clang version 15.0.7, GNU ld (GNU Binutils for Debian) 2.35.2
>
> Unfortunately, I don't have any reproducer for this issue yet.
>
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/f3d8ce4faab0/disk-fcd476ea.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/fc53d9dee279/vmlinux-fcd476ea.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/22ad755d39b2/bzImage-fcd476ea.xz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: [email protected]

If I am reading this correctly, this race can lead to NULL derefs
among other things.
hung_up_tty_fops does not have splice_read, while other fops have.

So the following code in splice can execute NULL callback:

if (unlikely(!in->f_op->splice_read))
return warn_unsupported(in, "read");
return in->f_op->splice_read(in, ppos, pipe, len, flags);




> ==================================================================
> BUG: KCSAN: data-race in __fput / __tty_hangup
>
> write to 0xffff88813b31e828 of 8 bytes by task 17702 on cpu 0:
> __tty_hangup+0x1e2/0x510 drivers/tty/tty_io.c:622
> tty_vhangup+0x17/0x20 drivers/tty/tty_io.c:701
> pty_close+0x262/0x280 drivers/tty/pty.c:79
> tty_release+0x204/0x930 drivers/tty/tty_io.c:1753
> __fput+0x245/0x570 fs/file_table.c:321
> ____fput+0x15/0x20 fs/file_table.c:349
> task_work_run+0x123/0x160 kernel/task_work.c:179
> get_signal+0xe5c/0xfe0 kernel/signal.c:2635
> arch_do_signal_or_restart+0x89/0x2b0 arch/x86/kernel/signal.c:306
> exit_to_user_mode_loop+0x6d/0xe0 kernel/entry/common.c:168
> exit_to_user_mode_prepare+0x6a/0xa0 kernel/entry/common.c:204
> __syscall_exit_to_user_mode_work kernel/entry/common.c:286 [inline]
> syscall_exit_to_user_mode+0x26/0x140 kernel/entry/common.c:297
> do_syscall_64+0x4d/0xc0 arch/x86/entry/common.c:86
> entry_SYSCALL_64_after_hwframe+0x63/0xcd
>
> read to 0xffff88813b31e828 of 8 bytes by task 17705 on cpu 1:
> __fput+0x21c/0x570 fs/file_table.c:320
> ____fput+0x15/0x20 fs/file_table.c:349
> task_work_run+0x123/0x160 kernel/task_work.c:179
> get_signal+0xe5c/0xfe0 kernel/signal.c:2635
> arch_do_signal_or_restart+0x89/0x2b0 arch/x86/kernel/signal.c:306
> exit_to_user_mode_loop+0x6d/0xe0 kernel/entry/common.c:168
> exit_to_user_mode_prepare+0x6a/0xa0 kernel/entry/common.c:204
> __syscall_exit_to_user_mode_work kernel/entry/common.c:286 [inline]
> syscall_exit_to_user_mode+0x26/0x140 kernel/entry/common.c:297
> do_syscall_64+0x4d/0xc0 arch/x86/entry/common.c:86
> entry_SYSCALL_64_after_hwframe+0x63/0xcd
>
> value changed: 0xffffffff84e91ed0 -> 0xffffffff84e91dc0
>
> Reported by Kernel Concurrency Sanitizer on:
> CPU: 1 PID: 17705 Comm: syz-executor.0 Not tainted 6.3.0-rc4-syzkaller-00034-gfcd476ea6a88 #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/17/2023
> ==================================================================
>
>
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at [email protected].
>
> syzbot will keep track of this issue. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

2023-04-21 15:23:03

by Tetsuo Handa

[permalink] [raw]
Subject: Re: [syzbot] [kernel?] KCSAN: data-race in __fput / __tty_hangup (4)

On 2023/04/21 17:21, Dmitry Vyukov wrote:
> If I am reading this correctly, this race can lead to NULL derefs
> among other things.
> hung_up_tty_fops does not have splice_read, while other fops have.
>
> So the following code in splice can execute NULL callback:
>
> if (unlikely(!in->f_op->splice_read))
> return warn_unsupported(in, "read");
> return in->f_op->splice_read(in, ppos, pipe, len, flags);
>

__fput(file) is called when the last reference to file is released.
Since __tty_hangup() traverses tty->tty_files under tty->files_lock,
tty_add_file() needs to hold a ref before adding to tty->tty_files
in order to defer concurrent __fput() by other threads?

diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index 36fb945fdad4..2838703d48cf 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -197,7 +197,7 @@ void tty_add_file(struct tty_struct *tty, struct file *file)
struct tty_file_private *priv = file->private_data;

priv->tty = tty;
- priv->file = file;
+ priv->file = get_file(file);

spin_lock(&tty->files_lock);
list_add(&priv->list, &tty->tty_files);
@@ -228,6 +228,7 @@ static void tty_del_file(struct file *file)
spin_lock(&tty->files_lock);
list_del(&priv->list);
spin_unlock(&tty->files_lock);
+ fput(file);
tty_free_file(file);
}


2023-04-21 16:11:27

by Tetsuo Handa

[permalink] [raw]
Subject: Re: [syzbot] [kernel?] KCSAN: data-race in __fput / __tty_hangup (4)

On 2023/04/22 0:12, Tetsuo Handa wrote:
> __fput(file) is called when the last reference to file is released.
> Since __tty_hangup() traverses tty->tty_files under tty->files_lock,
> tty_add_file() needs to hold a ref before adding to tty->tty_files
> in order to defer concurrent __fput() by other threads?

Hmm, this did not work; get_file() prevents tty_release().

2023-04-23 13:33:42

by Tetsuo Handa

[permalink] [raw]
Subject: Re: [syzbot] [kernel?] KCSAN: data-race in __fput / __tty_hangup (4)

On 2023/04/21 17:21, Dmitry Vyukov wrote:
> On Fri, 21 Apr 2023 at 10:18, syzbot
> <[email protected]> wrote:
>>
>> Hello,
>>
>> syzbot found the following issue on:
>>
>> HEAD commit: fcd476ea6a88 Merge tag 'urgent-rcu.2023.03.28a' of git://g..
>> git tree: upstream
>> console output: https://syzkaller.appspot.com/x/log.txt?x=146618b9c80000
>> kernel config: https://syzkaller.appspot.com/x/.config?x=f7350c77b8056a38
>> dashboard link: https://syzkaller.appspot.com/bug?extid=b7c3ba8cdc2f6cf83c21
>> compiler: Debian clang version 15.0.7, GNU ld (GNU Binutils for Debian) 2.35.2
>>
>> Unfortunately, I don't have any reproducer for this issue yet.
>>
>> Downloadable assets:
>> disk image: https://storage.googleapis.com/syzbot-assets/f3d8ce4faab0/disk-fcd476ea.raw.xz
>> vmlinux: https://storage.googleapis.com/syzbot-assets/fc53d9dee279/vmlinux-fcd476ea.xz
>> kernel image: https://storage.googleapis.com/syzbot-assets/22ad755d39b2/bzImage-fcd476ea.xz
>>
>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>> Reported-by: [email protected]
>
> If I am reading this correctly, this race can lead to NULL derefs
> among other things.
> hung_up_tty_fops does not have splice_read, while other fops have.
>
> So the following code in splice can execute NULL callback:
>
> if (unlikely(!in->f_op->splice_read))
> return warn_unsupported(in, "read");
> return in->f_op->splice_read(in, ppos, pipe, len, flags);

Hmm, it seems to me that we need multiple patches (which would become too big to
backport) for fixing this bug.

First step (which Dmitry mentioned) is to avoid potential NULL pointer dereferences
caused by

if (!f_op->$callbackname) {
return error;
}
return f_op->$callbackname($args);

pattern, for the next step will touch too many locations to change all at once whereas
the first step could be handled by implementing dummy function for all missing $callbackname.

Next step is to convert from

if (!f_op->$callbackname) {
return error;
}
return f_op->$callbackname($args);

to

fn = READ_ONCE(f_op->$callbackname);
if (!fn) {
return error;
}
return fn($args);

pattern.

Last step is to silence KCSAN by wrapping READ_ONCE()/WRITE_ONCE() using data_race() macro.
Each data_race() usage wants a comment, thus would want a macro if possible...

2023-04-23 14:13:38

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [syzbot] [kernel?] KCSAN: data-race in __fput / __tty_hangup (4)

On Sun, Apr 23, 2023 at 10:28:58PM +0900, Tetsuo Handa wrote:
> On 2023/04/21 17:21, Dmitry Vyukov wrote:
> > On Fri, 21 Apr 2023 at 10:18, syzbot
> > <[email protected]> wrote:
> >>
> >> Hello,
> >>
> >> syzbot found the following issue on:
> >>
> >> HEAD commit: fcd476ea6a88 Merge tag 'urgent-rcu.2023.03.28a' of git://g..
> >> git tree: upstream
> >> console output: https://syzkaller.appspot.com/x/log.txt?x=146618b9c80000
> >> kernel config: https://syzkaller.appspot.com/x/.config?x=f7350c77b8056a38
> >> dashboard link: https://syzkaller.appspot.com/bug?extid=b7c3ba8cdc2f6cf83c21
> >> compiler: Debian clang version 15.0.7, GNU ld (GNU Binutils for Debian) 2.35.2
> >>
> >> Unfortunately, I don't have any reproducer for this issue yet.
> >>
> >> Downloadable assets:
> >> disk image: https://storage.googleapis.com/syzbot-assets/f3d8ce4faab0/disk-fcd476ea.raw.xz
> >> vmlinux: https://storage.googleapis.com/syzbot-assets/fc53d9dee279/vmlinux-fcd476ea.xz
> >> kernel image: https://storage.googleapis.com/syzbot-assets/22ad755d39b2/bzImage-fcd476ea.xz
> >>
> >> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> >> Reported-by: [email protected]
> >
> > If I am reading this correctly, this race can lead to NULL derefs
> > among other things.
> > hung_up_tty_fops does not have splice_read, while other fops have.
> >
> > So the following code in splice can execute NULL callback:
> >
> > if (unlikely(!in->f_op->splice_read))
> > return warn_unsupported(in, "read");
> > return in->f_op->splice_read(in, ppos, pipe, len, flags);
>
> Hmm, it seems to me that we need multiple patches (which would become too big to
> backport) for fixing this bug.
>
> First step (which Dmitry mentioned) is to avoid potential NULL pointer dereferences
> caused by
>
> if (!f_op->$callbackname) {
> return error;
> }
> return f_op->$callbackname($args);
>
> pattern, for the next step will touch too many locations to change all at once whereas
> the first step could be handled by implementing dummy function for all missing $callbackname.

I do not understand, what "callbackname" is the problem here? Just
splice_read? Something else? And where would it need to be modified
and why can't we do it in one place only (i.e. install a default handler
instead.)

Also, pointer function operations like this are dirt slow, be wary of
adding additional ones if possible, on fast paths.

> Next step is to convert from
>
> if (!f_op->$callbackname) {
> return error;
> }
> return f_op->$callbackname($args);
>
> to
>
> fn = READ_ONCE(f_op->$callbackname);
> if (!fn) {
> return error;
> }
> return fn($args);
>
> pattern.

Why? What does this solve differently than the first one? What can
change the fops pointer between the check and call path? If something
can change it, then do NOT make that type of check in the first place
(or put a proper lock in place.)

thanks,

greg k-h

2023-04-23 14:13:39

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [syzbot] [kernel?] KCSAN: data-race in __fput / __tty_hangup (4)

On Sun, Apr 23, 2023 at 10:28:58PM +0900, Tetsuo Handa wrote:
> On 2023/04/21 17:21, Dmitry Vyukov wrote:
> > On Fri, 21 Apr 2023 at 10:18, syzbot
> > <[email protected]> wrote:
> >>
> >> Hello,
> >>
> >> syzbot found the following issue on:
> >>
> >> HEAD commit: fcd476ea6a88 Merge tag 'urgent-rcu.2023.03.28a' of git://g..
> >> git tree: upstream
> >> console output: https://syzkaller.appspot.com/x/log.txt?x=146618b9c80000
> >> kernel config: https://syzkaller.appspot.com/x/.config?x=f7350c77b8056a38
> >> dashboard link: https://syzkaller.appspot.com/bug?extid=b7c3ba8cdc2f6cf83c21
> >> compiler: Debian clang version 15.0.7, GNU ld (GNU Binutils for Debian) 2.35.2
> >>
> >> Unfortunately, I don't have any reproducer for this issue yet.
> >>
> >> Downloadable assets:
> >> disk image: https://storage.googleapis.com/syzbot-assets/f3d8ce4faab0/disk-fcd476ea.raw.xz
> >> vmlinux: https://storage.googleapis.com/syzbot-assets/fc53d9dee279/vmlinux-fcd476ea.xz
> >> kernel image: https://storage.googleapis.com/syzbot-assets/22ad755d39b2/bzImage-fcd476ea.xz
> >>
> >> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> >> Reported-by: [email protected]
> >
> > If I am reading this correctly, this race can lead to NULL derefs
> > among other things.
> > hung_up_tty_fops does not have splice_read, while other fops have.
> >
> > So the following code in splice can execute NULL callback:
> >
> > if (unlikely(!in->f_op->splice_read))
> > return warn_unsupported(in, "read");
> > return in->f_op->splice_read(in, ppos, pipe, len, flags);
>
> Hmm, it seems to me that we need multiple patches (which would become too big to
> backport) for fixing this bug.

Fix the bug properly first. And then only start to worry about stable
kernels, don't let the existance of them affect your development efforts
at all please.

thanks,

greg k-h

2023-04-23 14:27:10

by Tetsuo Handa

[permalink] [raw]
Subject: Re: [syzbot] [kernel?] KCSAN: data-race in __fput / __tty_hangup (4)

On 2023/04/23 23:03, Greg Kroah-Hartman wrote:
>> Next step is to convert from
>>
>> if (!f_op->$callbackname) {
>> return error;
>> }
>> return f_op->$callbackname($args);
>>
>> to
>>
>> fn = READ_ONCE(f_op->$callbackname);
>> if (!fn) {
>> return error;
>> }
>> return fn($args);
>>
>> pattern.
>
> Why? What does this solve differently than the first one? What can
> change the fops pointer between the check and call path? If something
> can change it, then do NOT make that type of check in the first place
> (or put a proper lock in place.)

__tty_hangup() is changing filp->f_op like

spin_lock(&tty->files_lock);
/* This breaks for file handles being sent over AF_UNIX sockets ? */
list_for_each_entry(priv, &tty->tty_files, list) {
filp = priv->file;
if (filp->f_op->write_iter == redirected_tty_write)
cons_filp = filp;
if (filp->f_op->write_iter != tty_write)
continue;
closecount++;
__tty_fasync(-1, filp, 0); /* can't block */
filp->f_op = &hung_up_tty_fops;
}
spin_unlock(&tty->files_lock);

but filp->f_op readers are (of course) not protected by tty->files_lock.
Like Dmitry mentioned

hung_up_tty_fops does not have splice_read, while other fops have.

, patterns like

if (unlikely(!in->f_op->splice_read))
return warn_unsupported(in, "read");
return in->f_op->splice_read(in, ppos, pipe, len, flags);

is not safe (if compiler generates a code that re-reads the pointer).

Using READ_ONCE() is for preventing the compiler to re-read, and using data_race()
is for teaching KCSAN that race while happens during READ_ONCE()/WRITE_ONCE()
is acceptable.

>> First step (which Dmitry mentioned) is to avoid potential NULL pointer dereferences
>> caused by
>>
>> if (!f_op->$callbackname) {
>> return error;
>> }
>> return f_op->$callbackname($args);
>>
>> pattern, for the next step will touch too many locations to change all at once whereas
>> the first step could be handled by implementing dummy function for all missing $callbackname.
>
> I do not understand, what "callbackname" is the problem here? Just
> splice_read? Something else?

I haven't checked.

> And where would it need to be modified
> and why can't we do it in one place only (i.e. install a default handler
> instead.)

Yes, adding a dummy splice_read callback handler to hung_up_tty_fops is
what I call the first step.

2023-04-23 23:47:23

by Al Viro

[permalink] [raw]
Subject: Re: [syzbot] [kernel?] KCSAN: data-race in __fput / __tty_hangup (4)

On Sat, Apr 22, 2023 at 12:12:02AM +0900, Tetsuo Handa wrote:
> On 2023/04/21 17:21, Dmitry Vyukov wrote:
> > If I am reading this correctly, this race can lead to NULL derefs
> > among other things.
> > hung_up_tty_fops does not have splice_read, while other fops have.
> >
> > So the following code in splice can execute NULL callback:
> >
> > if (unlikely(!in->f_op->splice_read))
> > return warn_unsupported(in, "read");
> > return in->f_op->splice_read(in, ppos, pipe, len, flags);
> >
>
> __fput(file) is called when the last reference to file is released.
> Since __tty_hangup() traverses tty->tty_files under tty->files_lock,
> tty_add_file() needs to hold a ref before adding to tty->tty_files
> in order to defer concurrent __fput() by other threads?
>
> diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
> index 36fb945fdad4..2838703d48cf 100644
> --- a/drivers/tty/tty_io.c
> +++ b/drivers/tty/tty_io.c
> @@ -197,7 +197,7 @@ void tty_add_file(struct tty_struct *tty, struct file *file)
> struct tty_file_private *priv = file->private_data;
>
> priv->tty = tty;
> - priv->file = file;
> + priv->file = get_file(file);

This is broken. Simple open() + close() on a tty will leak the damn thing -
that extra reference will stick around, possibly all the way until reboot.

As for the original report - add a (failing) ->splice_read() in hung_ut_tty_fops
to deal with the original problem.

2023-04-24 01:27:31

by Tetsuo Handa

[permalink] [raw]
Subject: Re: [syzbot] [kernel?] KCSAN: data-race in __fput / __tty_hangup (4)

On 2023/04/24 9:44, Al Viro wrote:
> Do you mean doing that in method instances that are present in tty_fops
> you mean doing that in method instances that are present in tty_fops
> and different in hung_up_tty_fops?

I meant, remove hung_up_tty_fops, and embed callbacks defined in hung_up_tty_fops into
tty_fops. For example, tty_read() (from "struct file_operations tty_fops"->read_iter)
will be changed like

static ssize_t tty_read(struct kiocb *iocb, struct iov_iter *to)
{
int i;
struct file *file = iocb->ki_filp;
struct inode *inode = file_inode(file);
struct tty_struct *tty = file_tty(file);
struct tty_ldisc *ld;

+ if (data_race(file->tty_hangup)) {
+ return hung_up_tty_read(iocb, to);
+ }
+
if (tty_paranoia_check(tty, inode, "tty_read"))
return -EIO;
if (!tty || tty_io_error(tty))
return -EIO;

/* We want to wait for the line discipline to sort out in this
* situation.
*/
ld = tty_ldisc_ref_wait(tty);
if (!ld)
return hung_up_tty_read(iocb, to);
i = -EIO;
if (ld->ops->read)
i = iterate_tty_read(ld, tty, file, to);
tty_ldisc_deref(ld);

if (i > 0)
tty_update_time(&inode->i_atime);

return i;
}

in order to avoid wrapping filp->f_op->* dereferences using data_race().

2023-04-24 01:45:16

by Al Viro

[permalink] [raw]
Subject: Re: [syzbot] [kernel?] KCSAN: data-race in __fput / __tty_hangup (4)

On Mon, Apr 24, 2023 at 08:55:58AM +0900, Tetsuo Handa wrote:
> On 2023/04/24 8:34, Al Viro wrote:
> > As for the original report - add a (failing) ->splice_read() in hung_ut_tty_fops
> > to deal with the original problem.
>
> Yes, adding a dummy splice_read callback is OK for avoiding NULL pointer dereference.
> But we need more changes for avoiding KCSAN race reporting.
>
> Are you OK with https://lkml.kernel.org/r/[email protected]
> which will require touching so many locations ?
>
> If you want tty layer handle this race without rewriting all f_op dereferences,
> we would need to replace
>
> filp->f_op = &hung_up_tty_fops;
>
> with
>
> data_race(filp->some_flags_for_tty = true);
>
> rather than
>
> data_race(filp->f_op = &hung_up_tty_fops);
>
> and check
>
> if (data_race(filp->some_flags_for_tty)) {
> return error;
> }
>
> from each "struct tty_operations" callback function.

What struct tty_operations? It's file_operations, unfortunately, and
their calls are on quite a few fast paths.

Do you mean doing that in method instances that are present in tty_fops
you mean doing that in method instances that are present in tty_fops
and different in hung_up_tty_fops?

2023-04-24 01:53:51

by Tetsuo Handa

[permalink] [raw]
Subject: Re: [syzbot] [kernel?] KCSAN: data-race in __fput / __tty_hangup (4)

On 2023/04/24 8:34, Al Viro wrote:
> As for the original report - add a (failing) ->splice_read() in hung_ut_tty_fops
> to deal with the original problem.

Yes, adding a dummy splice_read callback is OK for avoiding NULL pointer dereference.
But we need more changes for avoiding KCSAN race reporting.

Are you OK with https://lkml.kernel.org/r/[email protected]
which will require touching so many locations ?

If you want tty layer handle this race without rewriting all f_op dereferences,
we would need to replace

filp->f_op = &hung_up_tty_fops;

with

data_race(filp->some_flags_for_tty = true);

rather than

data_race(filp->f_op = &hung_up_tty_fops);

and check

if (data_race(filp->some_flags_for_tty)) {
return error;
}

from each "struct tty_operations" callback function.

2023-04-25 14:52:27

by Tetsuo Handa

[permalink] [raw]
Subject: Re: [syzbot] [kernel?] KCSAN: data-race in __fput / __tty_hangup (4)

tty_open() is doing "filp->f_op = &tty_fops;".
__tty_hangup() is doing "filp->f_op = &hung_up_tty_fops;".
tty_hung_up_p() is doing "filp->f_op == &hung_up_tty_fops",
and most functions are checking tty_hung_up_p().

Therefore, if we can remember whether __tty_hangup() did
"filp->f_op = &hung_up_tty_fops;" without changing filp->f_op,
we can reuse tty_hung_up_p().

Since tty_open() allocates sizeof("struct tty_file_private") bytes
of memory associated with each "struct file", we can record whether
__tty_hangup() did "filp->f_op = &hung_up_tty_fops;" using this memory.

Therefore, diff will be something like this?
(Explanation and question follows this diff.)

drivers/tty/tty_io.c | 36 +++++++++++++++++-------------------
include/linux/tty.h | 1 +
2 files changed, 18 insertions(+), 19 deletions(-)

diff --git a/include/linux/tty.h b/include/linux/tty.h
index 093935e97f42..d7fa18f8c526 100644
--- a/include/linux/tty.h
+++ b/include/linux/tty.h
@@ -255,6 +255,7 @@ struct tty_file_private {
struct tty_struct *tty;
struct file *file;
struct list_head list;
+ bool hung;
};

/**
diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index 36fb945fdad4..bebd236679f0 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -182,7 +182,7 @@ int tty_alloc_file(struct file *file)
{
struct tty_file_private *priv;

- priv = kmalloc(sizeof(*priv), GFP_KERNEL);
+ priv = kzalloc(sizeof(*priv), GFP_KERNEL);
if (!priv)
return -ENOMEM;

@@ -491,17 +491,6 @@ static const struct file_operations console_fops = {
.fasync = tty_fasync,
};

-static const struct file_operations hung_up_tty_fops = {
- .llseek = no_llseek,
- .read_iter = hung_up_tty_read,
- .write_iter = hung_up_tty_write,
- .poll = hung_up_tty_poll,
- .unlocked_ioctl = hung_up_tty_ioctl,
- .compat_ioctl = hung_up_tty_compat_ioctl,
- .release = tty_release,
- .fasync = hung_up_tty_fasync,
-};
-
static DEFINE_SPINLOCK(redirect_lock);
static struct file *redirect;

@@ -619,7 +608,8 @@ static void __tty_hangup(struct tty_struct *tty, int exit_session)
continue;
closecount++;
__tty_fasync(-1, filp, 0); /* can't block */
- filp->f_op = &hung_up_tty_fops;
+ /* Accept race with tty_hung_up_p() test. */
+ data_race(priv->hung = true);
}
spin_unlock(&tty->files_lock);

@@ -743,7 +733,8 @@ void tty_vhangup_session(struct tty_struct *tty)
*/
int tty_hung_up_p(struct file *filp)
{
- return (filp && filp->f_op == &hung_up_tty_fops);
+ /* Accept race with __tty_hangup(). */
+ return filp && data_race(((struct tty_file_private *) filp->private_data)->hung);
}
EXPORT_SYMBOL(tty_hung_up_p);

@@ -911,6 +902,8 @@ static ssize_t tty_read(struct kiocb *iocb, struct iov_iter *to)
struct tty_struct *tty = file_tty(file);
struct tty_ldisc *ld;

+ if (tty_hung_up_p(file))
+ return hung_up_tty_read(iocb, to);
if (tty_paranoia_check(tty, inode, "tty_read"))
return -EIO;
if (!tty || tty_io_error(tty))
@@ -1073,6 +1066,8 @@ static ssize_t file_tty_write(struct file *file, struct kiocb *iocb, struct iov_
struct tty_ldisc *ld;
ssize_t ret;

+ if (tty_hung_up_p(file))
+ return hung_up_tty_write(iocb, from);
if (tty_paranoia_check(tty, file_inode(file), "tty_write"))
return -EIO;
if (!tty || !tty->ops->write || tty_io_error(tty))
@@ -2159,11 +2154,6 @@ static int tty_open(struct inode *inode, struct file *filp)
return retval;

schedule();
- /*
- * Need to reset f_op in case a hangup happened.
- */
- if (tty_hung_up_p(filp))
- filp->f_op = &tty_fops;
goto retry_open;
}
clear_bit(TTY_HUPPED, &tty->flags);
@@ -2197,6 +2187,8 @@ static __poll_t tty_poll(struct file *filp, poll_table *wait)
struct tty_ldisc *ld;
__poll_t ret = 0;

+ if (tty_hung_up_p(filp))
+ return hung_up_tty_poll(filp, wait);
if (tty_paranoia_check(tty, file_inode(filp), "tty_poll"))
return 0;

@@ -2249,6 +2241,8 @@ static int tty_fasync(int fd, struct file *filp, int on)
struct tty_struct *tty = file_tty(filp);
int retval = -ENOTTY;

+ if (tty_hung_up_p(filp))
+ return hung_up_tty_fasync(fd, filp, on);
tty_lock(tty);
if (!tty_hung_up_p(filp))
retval = __tty_fasync(fd, filp, on);
@@ -2658,6 +2652,8 @@ long tty_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
int retval;
struct tty_ldisc *ld;

+ if (tty_hung_up_p(file))
+ return hung_up_tty_ioctl(file, cmd, arg);
if (tty_paranoia_check(tty, file_inode(file), "tty_ioctl"))
return -EINVAL;

@@ -2943,6 +2939,8 @@ static long tty_compat_ioctl(struct file *file, unsigned int cmd,
return tty_ioctl(file, cmd, arg);
}

+ if (tty_hung_up_p(file))
+ return hung_up_tty_compat_ioctl(file, cmd, arg);
if (tty_paranoia_check(tty, file_inode(file), "tty_ioctl"))
return -EINVAL;


Since tty_read() is called via file->f_op->read_iter() from call_read_iter() from
generic_file_splice_read(), no change is needed for tty_fops->splice_read.

Since tty_write() is called via file->f_op->write_iter() from call_write_iter()
from do_iter_readv_writev() from do_iter_write() from vfs_iter_write() from
iter_file_splice_write(), no change is needed for tty_fops->splice_write.

Since both tty_fops and hung_up_tty_fops point to the same callback for
llseek/release, no change is needed for tty_fops->llseek and tty_fops->release.

Regarding tty_read()/tty_write()/tty_poll()/tty_ioctl()/tty_compat_ioctl(),
respectively embed hung_up_tty_read()/hung_up_tty_write()/hung_up_tty_poll()/
hung_up_tty_ioctl()/hung_up_tty_compat_ioctl() right before tty_paranoia_check().

Regarding tty_fasync(), embed hung_up_tty_fasync() right before tty_lock() in
tty_fasync() rather than tty_paranoia_check() in __tty_fasync(), for
tty_hung_up_p() is checked right after tty_lock() returned.

I don't know what changes are required for tty_open() and tty_show_fdinfo().
I assume that making no change for tty_show_fdinfo() is harmless.
But how does "hung_up_tty_fops does not call tty_open()" affects here?

2023-04-25 16:16:07

by Al Viro

[permalink] [raw]
Subject: Re: [syzbot] [kernel?] KCSAN: data-race in __fput / __tty_hangup (4)

On Tue, Apr 25, 2023 at 11:47:20PM +0900, Tetsuo Handa wrote:
> @@ -743,7 +733,8 @@ void tty_vhangup_session(struct tty_struct *tty)
> */
> int tty_hung_up_p(struct file *filp)
> {
> - return (filp && filp->f_op == &hung_up_tty_fops);
> + /* Accept race with __tty_hangup(). */
> + return filp && data_race(((struct tty_file_private *) filp->private_data)->hung);
> }
> EXPORT_SYMBOL(tty_hung_up_p);

Umm... Are you sure we never call that for non-TTY files? Seeing that
it's exported and all such... For internal uses (tty_read(), etc.) the
check for file being NULL is pointless; for general-purpose primitive
we probably want to check ->f_op as well...

> @@ -2159,11 +2154,6 @@ static int tty_open(struct inode *inode, struct file *filp)
> return retval;
>
> schedule();
> - /*
> - * Need to reset f_op in case a hangup happened.
> - */
> - if (tty_hung_up_p(filp))
> - filp->f_op = &tty_fops;

That needs a comment in commit message - hangup indication doesn't
need to be reset here, since tty_release() has freed the entire thing
that used to hang off the ->private_data.

> I don't know what changes are required for tty_open() and tty_show_fdinfo().
> I assume that making no change for tty_show_fdinfo() is harmless.
> But how does "hung_up_tty_fops does not call tty_open()" affects here?

It doesn't - file->f_op->open() is only called once, and only on fresh
struct file.

2023-04-25 22:25:22

by Tetsuo Handa

[permalink] [raw]
Subject: Re: [syzbot] [kernel?] KCSAN: data-race in __fput / __tty_hangup (4)

On 2023/04/26 1:03, Al Viro wrote:
> On Tue, Apr 25, 2023 at 11:47:20PM +0900, Tetsuo Handa wrote:
>> @@ -743,7 +733,8 @@ void tty_vhangup_session(struct tty_struct *tty)
>> */
>> int tty_hung_up_p(struct file *filp)
>> {
>> - return (filp && filp->f_op == &hung_up_tty_fops);
>> + /* Accept race with __tty_hangup(). */
>> + return filp && data_race(((struct tty_file_private *) filp->private_data)->hung);
>> }
>> EXPORT_SYMBOL(tty_hung_up_p);
>
> Umm... Are you sure we never call that for non-TTY files? Seeing that
> it's exported and all such... For internal uses (tty_read(), etc.) the
> check for file being NULL is pointless; for general-purpose primitive
> we probably want to check ->f_op as well...

Indeed. Since drivers/tty/hvc/hvsi.c drivers/tty/n_hdlc.c drivers/tty/n_tty.c
are calling tty_hung_up_p(), this check should be

filp && filp->f_op == &tty_fops && data_race(((struct tty_file_private *) filp->private_data)->hung);

.

>> I don't know what changes are required for tty_open() and tty_show_fdinfo().
>> I assume that making no change for tty_show_fdinfo() is harmless.
>> But how does "hung_up_tty_fops does not call tty_open()" affects here?
>
> It doesn't - file->f_op->open() is only called once, and only on fresh
> struct file.

OK. Then, we can continue tty_hung_up_p() approach.

2023-04-26 11:26:24

by Tetsuo Handa

[permalink] [raw]
Subject: [PATCH] tty: tty_io: remove hung_up_tty_fops

syzbot is reporting data race between __tty_hangup() and __fput(), for
filp->f_op readers are not holding tty->files_lock.

Dmitry Vyukov mentioned that this race has possibility of NULL pointer
dereference, for tty_fops implements e.g. splice_read callback whereas
hung_up_tty_fops does not.

CPU0 CPU1
---- ----
do_splice_to() {
__tty_hangup() {
// f_op->splice_read was generic_file_splice_read
if (unlikely(!in->f_op->splice_read))
return warn_unsupported(in, "read");
filp->f_op = &hung_up_tty_fops;
// f_op->splice_read is now NULL
return in->f_op->splice_read(in, ppos, pipe, len, flags);
}
}

If we care about only NULL pointer dereference, implementing missing
callbacks to hung_up_tty_fops is fine. But if we also care about KCSAN
reports, we will need to wrap all filp->f_op usages which are reachable
via tty_fops callbacks using data_race().

Such wrapping is overkill as a fix for tty_io code. Therefore, instead of
implementing missing callbacks, stop updating filp->f_op and remove
hung_up_tty_fops. Then, changes will be limited to within tty_io code.

tty_open() is doing "filp->f_op = &tty_fops;".
__tty_hangup() is doing "filp->f_op = &hung_up_tty_fops;".
tty_hung_up_p() is doing "filp->f_op == &hung_up_tty_fops", and
most functions are checking tty_hung_up_p().

Since tty_open() allocates "struct tty_file_private" for each
"struct file", we can remember whether __tty_hangup() was called
by adding a flag to "struct tty_file_private".

Below is detail of where/what to change.

Regarding __tty_hangup(), replace "filp->f_op = &hung_up_tty_fops;" with
setting the above-mentioned flag.

Regarding tty_hungup_p(), replace "filp->f_op == &hung_up_tty_fops" with
"filp->f_op == &tty_fops" and check the above-mentioned flag.

Regarding tty_open(), just remove "filp->f_op = &tty_fops;" because
"struct tty_file_private" was already released by tty_del_file() from
tty_release() when control reaches this line.

Regarding tty_{read,write,poll,ioctl,compat_ioctl}(), respectively embed
hung_up_tty_{read,write,poll,ioctl,compat_ioctl}() right before
tty_paranoia_check().

Regarding tty_fasync(), embed hung_up_tty_fasync() right before tty_lock()
in tty_fasync() rather than tty_paranoia_check() in __tty_fasync(), for
tty_hung_up_p() is checked right after tty_lock() returned.

Since tty_read() is called via file->f_op->read_iter() from
call_read_iter() from generic_file_splice_read(), no change is needed for
tty_fops->splice_read.

Since tty_write() is called via file->f_op->write_iter() from
call_write_iter() from do_iter_readv_writev() from do_iter_write() from
vfs_iter_write() from iter_file_splice_write(), no change is needed for
tty_fops->splice_write.

Since both tty_fops and hung_up_tty_fops point to the same callback for
llseek/release, no change is needed for tty_fops->{llseek,release}.

Finally, remove hung_up_tty_fops.

Reported-by: syzbot <[email protected]>
Link: https://syzkaller.appspot.com/bug?extid=b7c3ba8cdc2f6cf83c21
Signed-off-by: Tetsuo Handa <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
---
drivers/tty/tty_io.c | 37 ++++++++++++++++++-------------------
include/linux/tty.h | 1 +
2 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index 36fb945fdad4..c1748e4da233 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -182,7 +182,7 @@ int tty_alloc_file(struct file *file)
{
struct tty_file_private *priv;

- priv = kmalloc(sizeof(*priv), GFP_KERNEL);
+ priv = kzalloc(sizeof(*priv), GFP_KERNEL);
if (!priv)
return -ENOMEM;

@@ -491,17 +491,6 @@ static const struct file_operations console_fops = {
.fasync = tty_fasync,
};

-static const struct file_operations hung_up_tty_fops = {
- .llseek = no_llseek,
- .read_iter = hung_up_tty_read,
- .write_iter = hung_up_tty_write,
- .poll = hung_up_tty_poll,
- .unlocked_ioctl = hung_up_tty_ioctl,
- .compat_ioctl = hung_up_tty_compat_ioctl,
- .release = tty_release,
- .fasync = hung_up_tty_fasync,
-};
-
static DEFINE_SPINLOCK(redirect_lock);
static struct file *redirect;

@@ -619,7 +608,8 @@ static void __tty_hangup(struct tty_struct *tty, int exit_session)
continue;
closecount++;
__tty_fasync(-1, filp, 0); /* can't block */
- filp->f_op = &hung_up_tty_fops;
+ /* Accept race with tty_hung_up_p() test. */
+ data_race(priv->hung = true);
}
spin_unlock(&tty->files_lock);

@@ -743,7 +733,9 @@ void tty_vhangup_session(struct tty_struct *tty)
*/
int tty_hung_up_p(struct file *filp)
{
- return (filp && filp->f_op == &hung_up_tty_fops);
+ return filp && filp->f_op == &tty_fops &&
+ /* Accept race with __tty_hangup(). */
+ data_race(((struct tty_file_private *) filp->private_data)->hung);
}
EXPORT_SYMBOL(tty_hung_up_p);

@@ -911,6 +903,8 @@ static ssize_t tty_read(struct kiocb *iocb, struct iov_iter *to)
struct tty_struct *tty = file_tty(file);
struct tty_ldisc *ld;

+ if (tty_hung_up_p(file))
+ return hung_up_tty_read(iocb, to);
if (tty_paranoia_check(tty, inode, "tty_read"))
return -EIO;
if (!tty || tty_io_error(tty))
@@ -1073,6 +1067,8 @@ static ssize_t file_tty_write(struct file *file, struct kiocb *iocb, struct iov_
struct tty_ldisc *ld;
ssize_t ret;

+ if (tty_hung_up_p(file))
+ return hung_up_tty_write(iocb, from);
if (tty_paranoia_check(tty, file_inode(file), "tty_write"))
return -EIO;
if (!tty || !tty->ops->write || tty_io_error(tty))
@@ -2159,11 +2155,6 @@ static int tty_open(struct inode *inode, struct file *filp)
return retval;

schedule();
- /*
- * Need to reset f_op in case a hangup happened.
- */
- if (tty_hung_up_p(filp))
- filp->f_op = &tty_fops;
goto retry_open;
}
clear_bit(TTY_HUPPED, &tty->flags);
@@ -2197,6 +2188,8 @@ static __poll_t tty_poll(struct file *filp, poll_table *wait)
struct tty_ldisc *ld;
__poll_t ret = 0;

+ if (tty_hung_up_p(filp))
+ return hung_up_tty_poll(filp, wait);
if (tty_paranoia_check(tty, file_inode(filp), "tty_poll"))
return 0;

@@ -2249,6 +2242,8 @@ static int tty_fasync(int fd, struct file *filp, int on)
struct tty_struct *tty = file_tty(filp);
int retval = -ENOTTY;

+ if (tty_hung_up_p(filp))
+ return hung_up_tty_fasync(fd, filp, on);
tty_lock(tty);
if (!tty_hung_up_p(filp))
retval = __tty_fasync(fd, filp, on);
@@ -2658,6 +2653,8 @@ long tty_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
int retval;
struct tty_ldisc *ld;

+ if (tty_hung_up_p(file))
+ return hung_up_tty_ioctl(file, cmd, arg);
if (tty_paranoia_check(tty, file_inode(file), "tty_ioctl"))
return -EINVAL;

@@ -2943,6 +2940,8 @@ static long tty_compat_ioctl(struct file *file, unsigned int cmd,
return tty_ioctl(file, cmd, arg);
}

+ if (tty_hung_up_p(file))
+ return hung_up_tty_compat_ioctl(file, cmd, arg);
if (tty_paranoia_check(tty, file_inode(file), "tty_ioctl"))
return -EINVAL;

diff --git a/include/linux/tty.h b/include/linux/tty.h
index 093935e97f42..d7fa18f8c526 100644
--- a/include/linux/tty.h
+++ b/include/linux/tty.h
@@ -255,6 +255,7 @@ struct tty_file_private {
struct tty_struct *tty;
struct file *file;
struct list_head list;
+ bool hung;
};

/**
--
2.34.1

2023-04-28 16:42:43

by Nathan Chancellor

[permalink] [raw]
Subject: Re: [PATCH] tty: tty_io: remove hung_up_tty_fops

Hi Tetsuo,

On Wed, Apr 26, 2023 at 08:05:03PM +0900, Tetsuo Handa wrote:
> syzbot is reporting data race between __tty_hangup() and __fput(), for
> filp->f_op readers are not holding tty->files_lock.
>
> Dmitry Vyukov mentioned that this race has possibility of NULL pointer
> dereference, for tty_fops implements e.g. splice_read callback whereas
> hung_up_tty_fops does not.
>
> CPU0 CPU1
> ---- ----
> do_splice_to() {
> __tty_hangup() {
> // f_op->splice_read was generic_file_splice_read
> if (unlikely(!in->f_op->splice_read))
> return warn_unsupported(in, "read");
> filp->f_op = &hung_up_tty_fops;
> // f_op->splice_read is now NULL
> return in->f_op->splice_read(in, ppos, pipe, len, flags);
> }
> }
>
> If we care about only NULL pointer dereference, implementing missing
> callbacks to hung_up_tty_fops is fine. But if we also care about KCSAN
> reports, we will need to wrap all filp->f_op usages which are reachable
> via tty_fops callbacks using data_race().
>
> Such wrapping is overkill as a fix for tty_io code. Therefore, instead of
> implementing missing callbacks, stop updating filp->f_op and remove
> hung_up_tty_fops. Then, changes will be limited to within tty_io code.
>
> tty_open() is doing "filp->f_op = &tty_fops;".
> __tty_hangup() is doing "filp->f_op = &hung_up_tty_fops;".
> tty_hung_up_p() is doing "filp->f_op == &hung_up_tty_fops", and
> most functions are checking tty_hung_up_p().
>
> Since tty_open() allocates "struct tty_file_private" for each
> "struct file", we can remember whether __tty_hangup() was called
> by adding a flag to "struct tty_file_private".
>
> Below is detail of where/what to change.
>
> Regarding __tty_hangup(), replace "filp->f_op = &hung_up_tty_fops;" with
> setting the above-mentioned flag.
>
> Regarding tty_hungup_p(), replace "filp->f_op == &hung_up_tty_fops" with
> "filp->f_op == &tty_fops" and check the above-mentioned flag.
>
> Regarding tty_open(), just remove "filp->f_op = &tty_fops;" because
> "struct tty_file_private" was already released by tty_del_file() from
> tty_release() when control reaches this line.
>
> Regarding tty_{read,write,poll,ioctl,compat_ioctl}(), respectively embed
> hung_up_tty_{read,write,poll,ioctl,compat_ioctl}() right before
> tty_paranoia_check().
>
> Regarding tty_fasync(), embed hung_up_tty_fasync() right before tty_lock()
> in tty_fasync() rather than tty_paranoia_check() in __tty_fasync(), for
> tty_hung_up_p() is checked right after tty_lock() returned.
>
> Since tty_read() is called via file->f_op->read_iter() from
> call_read_iter() from generic_file_splice_read(), no change is needed for
> tty_fops->splice_read.
>
> Since tty_write() is called via file->f_op->write_iter() from
> call_write_iter() from do_iter_readv_writev() from do_iter_write() from
> vfs_iter_write() from iter_file_splice_write(), no change is needed for
> tty_fops->splice_write.
>
> Since both tty_fops and hung_up_tty_fops point to the same callback for
> llseek/release, no change is needed for tty_fops->{llseek,release}.
>
> Finally, remove hung_up_tty_fops.
>
> Reported-by: syzbot <[email protected]>
> Link: https://syzkaller.appspot.com/bug?extid=b7c3ba8cdc2f6cf83c21
> Signed-off-by: Tetsuo Handa <[email protected]>
> Cc: Dmitry Vyukov <[email protected]>
> ---
> drivers/tty/tty_io.c | 37 ++++++++++++++++++-------------------
> include/linux/tty.h | 1 +
> 2 files changed, 19 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
> index 36fb945fdad4..c1748e4da233 100644
> --- a/drivers/tty/tty_io.c
> +++ b/drivers/tty/tty_io.c
> @@ -182,7 +182,7 @@ int tty_alloc_file(struct file *file)
> {
> struct tty_file_private *priv;
>
> - priv = kmalloc(sizeof(*priv), GFP_KERNEL);
> + priv = kzalloc(sizeof(*priv), GFP_KERNEL);
> if (!priv)
> return -ENOMEM;
>
> @@ -491,17 +491,6 @@ static const struct file_operations console_fops = {
> .fasync = tty_fasync,
> };
>
> -static const struct file_operations hung_up_tty_fops = {
> - .llseek = no_llseek,
> - .read_iter = hung_up_tty_read,
> - .write_iter = hung_up_tty_write,
> - .poll = hung_up_tty_poll,
> - .unlocked_ioctl = hung_up_tty_ioctl,
> - .compat_ioctl = hung_up_tty_compat_ioctl,
> - .release = tty_release,
> - .fasync = hung_up_tty_fasync,
> -};
> -
> static DEFINE_SPINLOCK(redirect_lock);
> static struct file *redirect;
>
> @@ -619,7 +608,8 @@ static void __tty_hangup(struct tty_struct *tty, int exit_session)
> continue;
> closecount++;
> __tty_fasync(-1, filp, 0); /* can't block */
> - filp->f_op = &hung_up_tty_fops;
> + /* Accept race with tty_hung_up_p() test. */
> + data_race(priv->hung = true);
> }
> spin_unlock(&tty->files_lock);
>
> @@ -743,7 +733,9 @@ void tty_vhangup_session(struct tty_struct *tty)
> */
> int tty_hung_up_p(struct file *filp)
> {
> - return (filp && filp->f_op == &hung_up_tty_fops);
> + return filp && filp->f_op == &tty_fops &&
> + /* Accept race with __tty_hangup(). */
> + data_race(((struct tty_file_private *) filp->private_data)->hung);
> }
> EXPORT_SYMBOL(tty_hung_up_p);
>
> @@ -911,6 +903,8 @@ static ssize_t tty_read(struct kiocb *iocb, struct iov_iter *to)
> struct tty_struct *tty = file_tty(file);
> struct tty_ldisc *ld;
>
> + if (tty_hung_up_p(file))
> + return hung_up_tty_read(iocb, to);
> if (tty_paranoia_check(tty, inode, "tty_read"))
> return -EIO;
> if (!tty || tty_io_error(tty))
> @@ -1073,6 +1067,8 @@ static ssize_t file_tty_write(struct file *file, struct kiocb *iocb, struct iov_
> struct tty_ldisc *ld;
> ssize_t ret;
>
> + if (tty_hung_up_p(file))
> + return hung_up_tty_write(iocb, from);
> if (tty_paranoia_check(tty, file_inode(file), "tty_write"))
> return -EIO;
> if (!tty || !tty->ops->write || tty_io_error(tty))
> @@ -2159,11 +2155,6 @@ static int tty_open(struct inode *inode, struct file *filp)
> return retval;
>
> schedule();
> - /*
> - * Need to reset f_op in case a hangup happened.
> - */
> - if (tty_hung_up_p(filp))
> - filp->f_op = &tty_fops;
> goto retry_open;
> }
> clear_bit(TTY_HUPPED, &tty->flags);
> @@ -2197,6 +2188,8 @@ static __poll_t tty_poll(struct file *filp, poll_table *wait)
> struct tty_ldisc *ld;
> __poll_t ret = 0;
>
> + if (tty_hung_up_p(filp))
> + return hung_up_tty_poll(filp, wait);
> if (tty_paranoia_check(tty, file_inode(filp), "tty_poll"))
> return 0;
>
> @@ -2249,6 +2242,8 @@ static int tty_fasync(int fd, struct file *filp, int on)
> struct tty_struct *tty = file_tty(filp);
> int retval = -ENOTTY;
>
> + if (tty_hung_up_p(filp))
> + return hung_up_tty_fasync(fd, filp, on);
> tty_lock(tty);
> if (!tty_hung_up_p(filp))
> retval = __tty_fasync(fd, filp, on);
> @@ -2658,6 +2653,8 @@ long tty_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
> int retval;
> struct tty_ldisc *ld;
>
> + if (tty_hung_up_p(file))
> + return hung_up_tty_ioctl(file, cmd, arg);
> if (tty_paranoia_check(tty, file_inode(file), "tty_ioctl"))
> return -EINVAL;
>
> @@ -2943,6 +2940,8 @@ static long tty_compat_ioctl(struct file *file, unsigned int cmd,
> return tty_ioctl(file, cmd, arg);
> }
>
> + if (tty_hung_up_p(file))
> + return hung_up_tty_compat_ioctl(file, cmd, arg);
> if (tty_paranoia_check(tty, file_inode(file), "tty_ioctl"))
> return -EINVAL;
>
> diff --git a/include/linux/tty.h b/include/linux/tty.h
> index 093935e97f42..d7fa18f8c526 100644
> --- a/include/linux/tty.h
> +++ b/include/linux/tty.h
> @@ -255,6 +255,7 @@ struct tty_file_private {
> struct tty_struct *tty;
> struct file *file;
> struct list_head list;
> + bool hung;
> };
>
> /**
> --
> 2.34.1
>

I see this change has shown up in -next as commit 4c87e9e5479b ("tty:
tty_io: remove hung_up_tty_fops"), where it causes the following warning
for configurations without CONFIG_COMPAT (I used ARCH=arm defconfig):

drivers/tty/tty_io.c:446:13: warning: 'hung_up_tty_compat_ioctl' defined but not used [-Wunused-function]
446 | static long hung_up_tty_compat_ioctl(struct file *file,
| ^~~~~~~~~~~~~~~~~~~~~~~~

I am not sure if you just added that patch for additional test coverage
or for final acceptance but the following diff resolves this warning for
me, perhaps it can be folded in for a v2?

Cheers,
Nathan

diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index 553182753098..9588ee356646 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -443,12 +443,6 @@ static long hung_up_tty_ioctl(struct file *file, unsigned int cmd,
return cmd == TIOCSPGRP ? -ENOTTY : -EIO;
}

-static long hung_up_tty_compat_ioctl(struct file *file,
- unsigned int cmd, unsigned long arg)
-{
- return cmd == TIOCSPGRP ? -ENOTTY : -EIO;
-}
-
static int hung_up_tty_fasync(int fd, struct file *file, int on)
{
return -ENOTTY;
@@ -2812,6 +2806,12 @@ struct serial_struct32 {
compat_int_t reserved;
};

+static long hung_up_tty_compat_ioctl(struct file *file,
+ unsigned int cmd, unsigned long arg)
+{
+ return cmd == TIOCSPGRP ? -ENOTTY : -EIO;
+}
+
static int compat_tty_tiocsserial(struct tty_struct *tty,
struct serial_struct32 __user *ss)
{

2023-04-28 16:52:02

by Tetsuo Handa

[permalink] [raw]
Subject: Re: [PATCH] tty: tty_io: remove hung_up_tty_fops

On 2023/04/29 1:27, Nathan Chancellor wrote:
> I see this change has shown up in -next as commit 4c87e9e5479b ("tty:
> tty_io: remove hung_up_tty_fops"), where it causes the following warning
> for configurations without CONFIG_COMPAT (I used ARCH=arm defconfig):
>
> drivers/tty/tty_io.c:446:13: warning: 'hung_up_tty_compat_ioctl' defined but not used [-Wunused-function]
> 446 | static long hung_up_tty_compat_ioctl(struct file *file,
> | ^~~~~~~~~~~~~~~~~~~~~~~~
>
> I am not sure if you just added that patch for additional test coverage
> or for final acceptance but the following diff resolves this warning for
> me, perhaps it can be folded in for a v2?

Thank you for reporting. Yes, moving the definition will solve the warning.

I'm testing this patch using my volatile tree before getting merged to a
permanent tree.

2023-04-28 17:13:57

by Al Viro

[permalink] [raw]
Subject: Re: [PATCH] tty: tty_io: remove hung_up_tty_fops

On Sat, Apr 29, 2023 at 01:41:02AM +0900, Tetsuo Handa wrote:
> On 2023/04/29 1:27, Nathan Chancellor wrote:
> > I see this change has shown up in -next as commit 4c87e9e5479b ("tty:
> > tty_io: remove hung_up_tty_fops"), where it causes the following warning
> > for configurations without CONFIG_COMPAT (I used ARCH=arm defconfig):
> >
> > drivers/tty/tty_io.c:446:13: warning: 'hung_up_tty_compat_ioctl' defined but not used [-Wunused-function]
> > 446 | static long hung_up_tty_compat_ioctl(struct file *file,
> > | ^~~~~~~~~~~~~~~~~~~~~~~~
> >
> > I am not sure if you just added that patch for additional test coverage
> > or for final acceptance but the following diff resolves this warning for
> > me, perhaps it can be folded in for a v2?
>
> Thank you for reporting. Yes, moving the definition will solve the warning.

IDGI... Why do you need to keep that function at all? Compare it
with hung_up_tty_ioctl() - they are token-for-token identical; the only
difference is the function name...

2023-04-28 17:43:23

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] tty: tty_io: remove hung_up_tty_fops

On Sat, Apr 29, 2023 at 01:41:02AM +0900, Tetsuo Handa wrote:
> On 2023/04/29 1:27, Nathan Chancellor wrote:
> > I see this change has shown up in -next as commit 4c87e9e5479b ("tty:
> > tty_io: remove hung_up_tty_fops"), where it causes the following warning
> > for configurations without CONFIG_COMPAT (I used ARCH=arm defconfig):
> >
> > drivers/tty/tty_io.c:446:13: warning: 'hung_up_tty_compat_ioctl' defined but not used [-Wunused-function]
> > 446 | static long hung_up_tty_compat_ioctl(struct file *file,
> > | ^~~~~~~~~~~~~~~~~~~~~~~~
> >
> > I am not sure if you just added that patch for additional test coverage
> > or for final acceptance but the following diff resolves this warning for
> > me, perhaps it can be folded in for a v2?
>
> Thank you for reporting. Yes, moving the definition will solve the warning.
>
> I'm testing this patch using my volatile tree before getting merged to a
> permanent tree.

How are you adding new stuff to linux-next right now when the tree
should be "closed"? And I haven't reviewed this yet, what tree has
picked it up?

Please wait until after -rc1 is out for new stuff, you know this...

thanks,

greg k-h

2023-04-29 10:46:24

by Tetsuo Handa

[permalink] [raw]
Subject: Re: [PATCH] tty: tty_io: remove hung_up_tty_fops

On 2023/04/29 2:11, Al Viro wrote:
> On Sat, Apr 29, 2023 at 01:41:02AM +0900, Tetsuo Handa wrote:
>> On 2023/04/29 1:27, Nathan Chancellor wrote:
>>> I see this change has shown up in -next as commit 4c87e9e5479b ("tty:
>>> tty_io: remove hung_up_tty_fops"), where it causes the following warning
>>> for configurations without CONFIG_COMPAT (I used ARCH=arm defconfig):
>>>
>>> drivers/tty/tty_io.c:446:13: warning: 'hung_up_tty_compat_ioctl' defined but not used [-Wunused-function]
>>> 446 | static long hung_up_tty_compat_ioctl(struct file *file,
>>> | ^~~~~~~~~~~~~~~~~~~~~~~~
>>>
>>> I am not sure if you just added that patch for additional test coverage
>>> or for final acceptance but the following diff resolves this warning for
>>> me, perhaps it can be folded in for a v2?
>>
>> Thank you for reporting. Yes, moving the definition will solve the warning.
>
> IDGI... Why do you need to keep that function at all? Compare it
> with hung_up_tty_ioctl() - they are token-for-token identical; the only
> difference is the function name...

Indeed hung_up_tty_ioctl() and hung_up_tty_compat_ioctl() are identical.
We can remove hung_up_tty_compat_ioctl() if we don't consider

ld = tty_ldisc_ref_wait(tty);
if (!ld)
- return hung_up_tty_compat_ioctl(file, cmd, arg);
+ return hung_up_tty_ioctl(file, cmd, arg);
if (ld->ops->compat_ioctl)
retval = ld->ops->compat_ioctl(tty, cmd, arg);

at tty_compat_ioctl() as strange looking.

Maybe adding "inline" keyword to hung_up_tty_*() definitions
because hung_up_tty_fops is removed?

2023-04-29 15:53:10

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH] tty: tty_io: remove hung_up_tty_fops

On Fri, Apr 28, 2023 at 07:31:35PM +0200, Greg Kroah-Hartman wrote:
> On Sat, Apr 29, 2023 at 01:41:02AM +0900, Tetsuo Handa wrote:
> > On 2023/04/29 1:27, Nathan Chancellor wrote:
> > > I see this change has shown up in -next as commit 4c87e9e5479b ("tty:
> > > tty_io: remove hung_up_tty_fops"), where it causes the following warning
> > > for configurations without CONFIG_COMPAT (I used ARCH=arm defconfig):
> > >
> > > drivers/tty/tty_io.c:446:13: warning: 'hung_up_tty_compat_ioctl' defined but not used [-Wunused-function]
> > > 446 | static long hung_up_tty_compat_ioctl(struct file *file,
> > > | ^~~~~~~~~~~~~~~~~~~~~~~~
> > >
> > > I am not sure if you just added that patch for additional test coverage
> > > or for final acceptance but the following diff resolves this warning for
> > > me, perhaps it can be folded in for a v2?
> >
> > Thank you for reporting. Yes, moving the definition will solve the warning.
> >
> > I'm testing this patch using my volatile tree before getting merged to a
> > permanent tree.
>
> How are you adding new stuff to linux-next right now when the tree
> should be "closed"? And I haven't reviewed this yet, what tree has

That happens all the time, and quite often the "late" additions are applied
to mainline immediately afterwards and end up causing problems there.
I can only hope that this won't happen with this one.

> picked it up?
>
> Please wait until after -rc1 is out for new stuff, you know this...
>

This patch is supposed to fix a data race, so maybe it wasn't considered
"new". Just guessing, of course.

Guenter

2023-05-01 18:45:31

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: [PATCH] tty: tty_io: remove hung_up_tty_fops

On Sat, Apr 29, 2023 at 5:56 PM Guenter Roeck <[email protected]> wrote:
> On Fri, Apr 28, 2023 at 07:31:35PM +0200, Greg Kroah-Hartman wrote:
> > On Sat, Apr 29, 2023 at 01:41:02AM +0900, Tetsuo Handa wrote:
> > > On 2023/04/29 1:27, Nathan Chancellor wrote:
> > > > I see this change has shown up in -next as commit 4c87e9e5479b ("tty:
> > > > tty_io: remove hung_up_tty_fops"), where it causes the following warning
> > > > for configurations without CONFIG_COMPAT (I used ARCH=arm defconfig):
> > > >
> > > > drivers/tty/tty_io.c:446:13: warning: 'hung_up_tty_compat_ioctl' defined but not used [-Wunused-function]
> > > > 446 | static long hung_up_tty_compat_ioctl(struct file *file,
> > > > | ^~~~~~~~~~~~~~~~~~~~~~~~
> > > >
> > > > I am not sure if you just added that patch for additional test coverage
> > > > or for final acceptance but the following diff resolves this warning for
> > > > me, perhaps it can be folded in for a v2?
> > >
> > > Thank you for reporting. Yes, moving the definition will solve the warning.
> > >
> > > I'm testing this patch using my volatile tree before getting merged to a
> > > permanent tree.
> >
> > How are you adding new stuff to linux-next right now when the tree
> > should be "closed"? And I haven't reviewed this yet, what tree has
>
> That happens all the time, and quite often the "late" additions are applied
> to mainline immediately afterwards and end up causing problems there.
> I can only hope that this won't happen with this one.

And it might be picked up by stable ;-(

> > picked it up?
> >
> > Please wait until after -rc1 is out for new stuff, you know this...
>
> This patch is supposed to fix a data race, so maybe it wasn't considered
> "new". Just guessing, of course.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2023-05-14 01:39:04

by Tetsuo Handa

[permalink] [raw]
Subject: [PATCH v2] tty: tty_io: remove hung_up_tty_fops

syzbot is reporting data race between __tty_hangup() and __fput(), for
filp->f_op readers are not holding tty->files_lock.

Dmitry Vyukov mentioned that this race has possibility of NULL pointer
dereference, for tty_fops implements e.g. splice_read callback whereas
hung_up_tty_fops does not.

CPU0 CPU1
---- ----
do_splice_to() {
__tty_hangup() {
// f_op->splice_read was generic_file_splice_read
if (unlikely(!in->f_op->splice_read))
return warn_unsupported(in, "read");
filp->f_op = &hung_up_tty_fops;
// f_op->splice_read is now NULL
return in->f_op->splice_read(in, ppos, pipe, len, flags);
}
}

If we care about only NULL pointer dereference, implementing missing
callbacks to hung_up_tty_fops is fine. But if we also care about KCSAN
reports, we will need to wrap all filp->f_op usages which are reachable
via tty_fops callbacks using data_race().

Such wrapping is overkill as a fix for tty_io code. Therefore, instead of
implementing missing callbacks, stop updating filp->f_op and remove
hung_up_tty_fops. Then, changes will be limited to within tty_io code.

tty_open() is doing "filp->f_op = &tty_fops;".
__tty_hangup() is doing "filp->f_op = &hung_up_tty_fops;".
tty_hung_up_p() is doing "filp->f_op == &hung_up_tty_fops", and
most functions are checking tty_hung_up_p().

Since tty_open() allocates "struct tty_file_private" for each
"struct file", we can remember whether __tty_hangup() was called
by adding a flag to "struct tty_file_private".

Below is detail of where/what to change.

Regarding __tty_hangup(), replace "filp->f_op = &hung_up_tty_fops;" with
setting the above-mentioned flag.

Regarding tty_hungup_p(), replace "filp->f_op == &hung_up_tty_fops" with
"filp->f_op == &tty_fops" and check the above-mentioned flag.

Regarding tty_open(), just remove "filp->f_op = &tty_fops;" because
"struct tty_file_private" was already released by tty_del_file() from
tty_release() when control reaches this line.

Regarding tty_{read,write,poll,ioctl,compat_ioctl}(), respectively embed
hung_up_tty_{read,write,poll,ioctl,compat_ioctl}() right before
tty_paranoia_check().

Regarding tty_fasync(), embed hung_up_tty_fasync() right before tty_lock()
in tty_fasync() rather than tty_paranoia_check() in __tty_fasync(), for
tty_hung_up_p() is checked right after tty_lock() returned.

Since tty_read() is called via file->f_op->read_iter() from
call_read_iter() from generic_file_splice_read(), no change is needed for
tty_fops->splice_read.

Since tty_write() is called via file->f_op->write_iter() from
call_write_iter() from do_iter_readv_writev() from do_iter_write() from
vfs_iter_write() from iter_file_splice_write(), no change is needed for
tty_fops->splice_write.

Since both tty_fops and hung_up_tty_fops point to the same callback for
llseek/release, no change is needed for tty_fops->{llseek,release}.

Finally, remove hung_up_tty_fops and mark callbacks used by
hung_up_tty_fops as "inline".

Reported-by: syzbot <[email protected]>
Closes: https://syzkaller.appspot.com/bug?extid=b7c3ba8cdc2f6cf83c21
Signed-off-by: Tetsuo Handa <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
---
Changes in v2:
Mark callbacks used by hung_up_tty_fops as "inline" in order to fix
-Wunused-function warning when CONFIG_COMPAT=n, reported by
Nathan Chancellor <[email protected]> and Arnd Bergmann <[email protected]>.

drivers/tty/tty_io.c | 49 ++++++++++++++++++++++----------------------
include/linux/tty.h | 1 +
2 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index c84be40fb8df..bff0a2ffa68f 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -182,7 +182,7 @@ int tty_alloc_file(struct file *file)
{
struct tty_file_private *priv;

- priv = kmalloc(sizeof(*priv), GFP_KERNEL);
+ priv = kzalloc(sizeof(*priv), GFP_KERNEL);
if (!priv)
return -ENOMEM;

@@ -421,35 +421,35 @@ struct tty_driver *tty_find_polling_driver(char *name, int *line)
EXPORT_SYMBOL_GPL(tty_find_polling_driver);
#endif

-static ssize_t hung_up_tty_read(struct kiocb *iocb, struct iov_iter *to)
+static inline ssize_t hung_up_tty_read(struct kiocb *iocb, struct iov_iter *to)
{
return 0;
}

-static ssize_t hung_up_tty_write(struct kiocb *iocb, struct iov_iter *from)
+static inline ssize_t hung_up_tty_write(struct kiocb *iocb, struct iov_iter *from)
{
return -EIO;
}

/* No kernel lock held - none needed ;) */
-static __poll_t hung_up_tty_poll(struct file *filp, poll_table *wait)
+static inline __poll_t hung_up_tty_poll(struct file *filp, poll_table *wait)
{
return EPOLLIN | EPOLLOUT | EPOLLERR | EPOLLHUP | EPOLLRDNORM | EPOLLWRNORM;
}

-static long hung_up_tty_ioctl(struct file *file, unsigned int cmd,
+static inline long hung_up_tty_ioctl(struct file *file, unsigned int cmd,
unsigned long arg)
{
return cmd == TIOCSPGRP ? -ENOTTY : -EIO;
}

-static long hung_up_tty_compat_ioctl(struct file *file,
+static inline long hung_up_tty_compat_ioctl(struct file *file,
unsigned int cmd, unsigned long arg)
{
return cmd == TIOCSPGRP ? -ENOTTY : -EIO;
}

-static int hung_up_tty_fasync(int fd, struct file *file, int on)
+static inline int hung_up_tty_fasync(int fd, struct file *file, int on)
{
return -ENOTTY;
}
@@ -491,17 +491,6 @@ static const struct file_operations console_fops = {
.fasync = tty_fasync,
};

-static const struct file_operations hung_up_tty_fops = {
- .llseek = no_llseek,
- .read_iter = hung_up_tty_read,
- .write_iter = hung_up_tty_write,
- .poll = hung_up_tty_poll,
- .unlocked_ioctl = hung_up_tty_ioctl,
- .compat_ioctl = hung_up_tty_compat_ioctl,
- .release = tty_release,
- .fasync = hung_up_tty_fasync,
-};
-
static DEFINE_SPINLOCK(redirect_lock);
static struct file *redirect;

@@ -619,7 +608,8 @@ static void __tty_hangup(struct tty_struct *tty, int exit_session)
continue;
closecount++;
__tty_fasync(-1, filp, 0); /* can't block */
- filp->f_op = &hung_up_tty_fops;
+ /* Accept race with tty_hung_up_p() test. */
+ data_race(priv->hung = true);
}
spin_unlock(&tty->files_lock);

@@ -743,7 +733,9 @@ void tty_vhangup_session(struct tty_struct *tty)
*/
int tty_hung_up_p(struct file *filp)
{
- return (filp && filp->f_op == &hung_up_tty_fops);
+ return filp && filp->f_op == &tty_fops &&
+ /* Accept race with __tty_hangup(). */
+ data_race(((struct tty_file_private *) filp->private_data)->hung);
}
EXPORT_SYMBOL(tty_hung_up_p);

@@ -911,6 +903,8 @@ static ssize_t tty_read(struct kiocb *iocb, struct iov_iter *to)
struct tty_struct *tty = file_tty(file);
struct tty_ldisc *ld;

+ if (tty_hung_up_p(file))
+ return hung_up_tty_read(iocb, to);
if (tty_paranoia_check(tty, inode, "tty_read"))
return -EIO;
if (!tty || tty_io_error(tty))
@@ -1073,6 +1067,8 @@ static ssize_t file_tty_write(struct file *file, struct kiocb *iocb, struct iov_
struct tty_ldisc *ld;
ssize_t ret;

+ if (tty_hung_up_p(file))
+ return hung_up_tty_write(iocb, from);
if (tty_paranoia_check(tty, file_inode(file), "tty_write"))
return -EIO;
if (!tty || !tty->ops->write || tty_io_error(tty))
@@ -2159,11 +2155,6 @@ static int tty_open(struct inode *inode, struct file *filp)
return retval;

schedule();
- /*
- * Need to reset f_op in case a hangup happened.
- */
- if (tty_hung_up_p(filp))
- filp->f_op = &tty_fops;
goto retry_open;
}
clear_bit(TTY_HUPPED, &tty->flags);
@@ -2197,6 +2188,8 @@ static __poll_t tty_poll(struct file *filp, poll_table *wait)
struct tty_ldisc *ld;
__poll_t ret = 0;

+ if (tty_hung_up_p(filp))
+ return hung_up_tty_poll(filp, wait);
if (tty_paranoia_check(tty, file_inode(filp), "tty_poll"))
return 0;

@@ -2249,6 +2242,8 @@ static int tty_fasync(int fd, struct file *filp, int on)
struct tty_struct *tty = file_tty(filp);
int retval = -ENOTTY;

+ if (tty_hung_up_p(filp))
+ return hung_up_tty_fasync(fd, filp, on);
tty_lock(tty);
if (!tty_hung_up_p(filp))
retval = __tty_fasync(fd, filp, on);
@@ -2658,6 +2653,8 @@ long tty_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
int retval;
struct tty_ldisc *ld;

+ if (tty_hung_up_p(file))
+ return hung_up_tty_ioctl(file, cmd, arg);
if (tty_paranoia_check(tty, file_inode(file), "tty_ioctl"))
return -EINVAL;

@@ -2943,6 +2940,8 @@ static long tty_compat_ioctl(struct file *file, unsigned int cmd,
return tty_ioctl(file, cmd, arg);
}

+ if (tty_hung_up_p(file))
+ return hung_up_tty_compat_ioctl(file, cmd, arg);
if (tty_paranoia_check(tty, file_inode(file), "tty_ioctl"))
return -EINVAL;

diff --git a/include/linux/tty.h b/include/linux/tty.h
index e8d5d9997aca..9ea30598c576 100644
--- a/include/linux/tty.h
+++ b/include/linux/tty.h
@@ -255,6 +255,7 @@ struct tty_file_private {
struct tty_struct *tty;
struct file *file;
struct list_head list;
+ bool hung;
};

/**
--
2.18.4


2023-05-30 11:07:03

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v2] tty: tty_io: remove hung_up_tty_fops

On Sun, May 14, 2023 at 10:02:26AM +0900, Tetsuo Handa wrote:
> syzbot is reporting data race between __tty_hangup() and __fput(), for
> filp->f_op readers are not holding tty->files_lock.
>
> Dmitry Vyukov mentioned that this race has possibility of NULL pointer
> dereference, for tty_fops implements e.g. splice_read callback whereas
> hung_up_tty_fops does not.
>
> CPU0 CPU1
> ---- ----
> do_splice_to() {
> __tty_hangup() {
> // f_op->splice_read was generic_file_splice_read
> if (unlikely(!in->f_op->splice_read))
> return warn_unsupported(in, "read");
> filp->f_op = &hung_up_tty_fops;
> // f_op->splice_read is now NULL
> return in->f_op->splice_read(in, ppos, pipe, len, flags);
> }
> }
>
> If we care about only NULL pointer dereference, implementing missing
> callbacks to hung_up_tty_fops is fine. But if we also care about KCSAN
> reports, we will need to wrap all filp->f_op usages which are reachable
> via tty_fops callbacks using data_race().

I'm missing something here. Why would KCSAN report problems if we
implement the needed callbacks in hung_up_tty_fops? And what reports
would they be?

And why would data_race() help here?


> Such wrapping is overkill as a fix for tty_io code. Therefore, instead of
> implementing missing callbacks, stop updating filp->f_op and remove
> hung_up_tty_fops. Then, changes will be limited to within tty_io code.
>
> tty_open() is doing "filp->f_op = &tty_fops;".
> __tty_hangup() is doing "filp->f_op = &hung_up_tty_fops;".
> tty_hung_up_p() is doing "filp->f_op == &hung_up_tty_fops", and
> most functions are checking tty_hung_up_p().
>
> Since tty_open() allocates "struct tty_file_private" for each
> "struct file", we can remember whether __tty_hangup() was called
> by adding a flag to "struct tty_file_private".
>
> Below is detail of where/what to change.
>
> Regarding __tty_hangup(), replace "filp->f_op = &hung_up_tty_fops;" with
> setting the above-mentioned flag.
>
> Regarding tty_hungup_p(), replace "filp->f_op == &hung_up_tty_fops" with
> "filp->f_op == &tty_fops" and check the above-mentioned flag.
>
> Regarding tty_open(), just remove "filp->f_op = &tty_fops;" because
> "struct tty_file_private" was already released by tty_del_file() from
> tty_release() when control reaches this line.
>
> Regarding tty_{read,write,poll,ioctl,compat_ioctl}(), respectively embed
> hung_up_tty_{read,write,poll,ioctl,compat_ioctl}() right before
> tty_paranoia_check().
>
> Regarding tty_fasync(), embed hung_up_tty_fasync() right before tty_lock()
> in tty_fasync() rather than tty_paranoia_check() in __tty_fasync(), for
> tty_hung_up_p() is checked right after tty_lock() returned.
>
> Since tty_read() is called via file->f_op->read_iter() from
> call_read_iter() from generic_file_splice_read(), no change is needed for
> tty_fops->splice_read.
>
> Since tty_write() is called via file->f_op->write_iter() from
> call_write_iter() from do_iter_readv_writev() from do_iter_write() from
> vfs_iter_write() from iter_file_splice_write(), no change is needed for
> tty_fops->splice_write.
>
> Since both tty_fops and hung_up_tty_fops point to the same callback for
> llseek/release, no change is needed for tty_fops->{llseek,release}.
>
> Finally, remove hung_up_tty_fops and mark callbacks used by
> hung_up_tty_fops as "inline".
>
> Reported-by: syzbot <[email protected]>
> Closes: https://syzkaller.appspot.com/bug?extid=b7c3ba8cdc2f6cf83c21
> Signed-off-by: Tetsuo Handa <[email protected]>
> Cc: Dmitry Vyukov <[email protected]>
> ---
> Changes in v2:
> Mark callbacks used by hung_up_tty_fops as "inline" in order to fix
> -Wunused-function warning when CONFIG_COMPAT=n, reported by
> Nathan Chancellor <[email protected]> and Arnd Bergmann <[email protected]>.
>
> drivers/tty/tty_io.c | 49 ++++++++++++++++++++++----------------------
> include/linux/tty.h | 1 +
> 2 files changed, 25 insertions(+), 25 deletions(-)
>
> diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
> index c84be40fb8df..bff0a2ffa68f 100644
> --- a/drivers/tty/tty_io.c
> +++ b/drivers/tty/tty_io.c
> @@ -182,7 +182,7 @@ int tty_alloc_file(struct file *file)
> {
> struct tty_file_private *priv;
>
> - priv = kmalloc(sizeof(*priv), GFP_KERNEL);
> + priv = kzalloc(sizeof(*priv), GFP_KERNEL);

Why is this zeroing out everything now? Just because you added one
bool? Why not just set the bool properly instead?

> if (!priv)
> return -ENOMEM;
>
> @@ -421,35 +421,35 @@ struct tty_driver *tty_find_polling_driver(char *name, int *line)
> EXPORT_SYMBOL_GPL(tty_find_polling_driver);
> #endif
>
> -static ssize_t hung_up_tty_read(struct kiocb *iocb, struct iov_iter *to)
> +static inline ssize_t hung_up_tty_read(struct kiocb *iocb, struct iov_iter *to)
> {
> return 0;
> }
>
> -static ssize_t hung_up_tty_write(struct kiocb *iocb, struct iov_iter *from)
> +static inline ssize_t hung_up_tty_write(struct kiocb *iocb, struct iov_iter *from)
> {
> return -EIO;
> }
>
> /* No kernel lock held - none needed ;) */
> -static __poll_t hung_up_tty_poll(struct file *filp, poll_table *wait)
> +static inline __poll_t hung_up_tty_poll(struct file *filp, poll_table *wait)
> {
> return EPOLLIN | EPOLLOUT | EPOLLERR | EPOLLHUP | EPOLLRDNORM | EPOLLWRNORM;
> }
>
> -static long hung_up_tty_ioctl(struct file *file, unsigned int cmd,
> +static inline long hung_up_tty_ioctl(struct file *file, unsigned int cmd,
> unsigned long arg)
> {
> return cmd == TIOCSPGRP ? -ENOTTY : -EIO;
> }
>
> -static long hung_up_tty_compat_ioctl(struct file *file,
> +static inline long hung_up_tty_compat_ioctl(struct file *file,
> unsigned int cmd, unsigned long arg)
> {
> return cmd == TIOCSPGRP ? -ENOTTY : -EIO;
> }

Marking these as inline, and then treating them as a function pointer,
seems like a horrid way to work around a compiler warning. As they
really are not inline functions anymore, but yet the compiler doesn't
know that. Odds are once the compiler gets smarter, the warnings will
return, so please, solve this properly.


>
> -static int hung_up_tty_fasync(int fd, struct file *file, int on)
> +static inline int hung_up_tty_fasync(int fd, struct file *file, int on)
> {
> return -ENOTTY;
> }
> @@ -491,17 +491,6 @@ static const struct file_operations console_fops = {
> .fasync = tty_fasync,
> };
>
> -static const struct file_operations hung_up_tty_fops = {
> - .llseek = no_llseek,
> - .read_iter = hung_up_tty_read,
> - .write_iter = hung_up_tty_write,
> - .poll = hung_up_tty_poll,
> - .unlocked_ioctl = hung_up_tty_ioctl,
> - .compat_ioctl = hung_up_tty_compat_ioctl,
> - .release = tty_release,
> - .fasync = hung_up_tty_fasync,
> -};
> -
> static DEFINE_SPINLOCK(redirect_lock);
> static struct file *redirect;
>
> @@ -619,7 +608,8 @@ static void __tty_hangup(struct tty_struct *tty, int exit_session)
> continue;
> closecount++;
> __tty_fasync(-1, filp, 0); /* can't block */
> - filp->f_op = &hung_up_tty_fops;
> + /* Accept race with tty_hung_up_p() test. */
> + data_race(priv->hung = true);

Why accept it? Say why it's not really an issue here.

> }
> spin_unlock(&tty->files_lock);
>
> @@ -743,7 +733,9 @@ void tty_vhangup_session(struct tty_struct *tty)
> */
> int tty_hung_up_p(struct file *filp)
> {
> - return (filp && filp->f_op == &hung_up_tty_fops);
> + return filp && filp->f_op == &tty_fops &&
> + /* Accept race with __tty_hangup(). */
> + data_race(((struct tty_file_private *) filp->private_data)->hung);

Same here.

> }
> EXPORT_SYMBOL(tty_hung_up_p);
>
> @@ -911,6 +903,8 @@ static ssize_t tty_read(struct kiocb *iocb, struct iov_iter *to)
> struct tty_struct *tty = file_tty(file);
> struct tty_ldisc *ld;
>
> + if (tty_hung_up_p(file))
> + return hung_up_tty_read(iocb, to);

What happens if you hang up _right_ after this check? There's no
locking here, right? Same everywhere else you have this pattern, you
made the race window smaller, but it's still there from what I can see.

> --- a/include/linux/tty.h
> +++ b/include/linux/tty.h
> @@ -255,6 +255,7 @@ struct tty_file_private {
> struct tty_struct *tty;
> struct file *file;
> struct list_head list;
> + bool hung;

No hint as to what "hung" means here?

thanks,

greg k-h

2023-05-30 12:07:20

by Tetsuo Handa

[permalink] [raw]
Subject: Re: [PATCH v2] tty: tty_io: remove hung_up_tty_fops

On 2023/05/30 19:44, Greg Kroah-Hartman wrote:
> On Sun, May 14, 2023 at 10:02:26AM +0900, Tetsuo Handa wrote:
>> If we care about only NULL pointer dereference, implementing missing
>> callbacks to hung_up_tty_fops is fine. But if we also care about KCSAN
>> reports, we will need to wrap all filp->f_op usages which are reachable
>> via tty_fops callbacks using data_race().
>
> I'm missing something here. Why would KCSAN report problems if we
> implement the needed callbacks in hung_up_tty_fops? And what reports
> would they be?

Unlike atomic operations such as atomic_read()/atomic_set(), normal read/write
operations are not atomic for KCSAN. KCSAN reports some value being changed
during a read/write.

In this report, KCSAN detected that __tty_hangup() changed the value of
filp->f_op from 0xffffffff84e91ed0 to 0xffffffff84e91dc0 at

filp->f_op = &hung_up_tty_fops;

line when __fput() was reading the value of filp->f_op at

if (file->f_op->release)

line.

Even if we implement the needed callbacks in hung_up_tty_fops,
KCSAN will continue reporting that the value of filp->f_op changes.

>
> And why would data_race() help here?

data_race() tells KCSAN not to report.
data_race() is used when the race KCSAN checks is harmless.



>> @@ -182,7 +182,7 @@ int tty_alloc_file(struct file *file)
>> {
>> struct tty_file_private *priv;
>>
>> - priv = kmalloc(sizeof(*priv), GFP_KERNEL);
>> + priv = kzalloc(sizeof(*priv), GFP_KERNEL);
>
> Why is this zeroing out everything now? Just because you added one
> bool? Why not just set the bool properly instead?

Because I consider that this function is not performance critical where
avoid increasing code size by zeroing out everything is acceptable.



>> -static long hung_up_tty_compat_ioctl(struct file *file,
>> +static inline long hung_up_tty_compat_ioctl(struct file *file,
>> unsigned int cmd, unsigned long arg)
>> {
>> return cmd == TIOCSPGRP ? -ENOTTY : -EIO;
>> }
>
> Marking these as inline, and then treating them as a function pointer,
> seems like a horrid way to work around a compiler warning. As they
> really are not inline functions anymore, but yet the compiler doesn't
> know that. Odds are once the compiler gets smarter, the warnings will
> return, so please, solve this properly.

Since this patch removes "struct file_operations hung_up_tty_fops"
which was the only source of treating as a function pointer,
these inlined functions are no longer treated as a function pointer.



>> @@ -619,7 +608,8 @@ static void __tty_hangup(struct tty_struct *tty, int exit_session)
>> continue;
>> closecount++;
>> __tty_fasync(-1, filp, 0); /* can't block */
>> - filp->f_op = &hung_up_tty_fops;
>> + /* Accept race with tty_hung_up_p() test. */
>> + data_race(priv->hung = true);
>
> Why accept it? Say why it's not really an issue here.

Because whether tty_hung_up_p() sees true or false due to concurrent
access does not matter. The race KCSAN reported is harmless (unless
callbacks suddenly disappear).



>> @@ -743,7 +733,9 @@ void tty_vhangup_session(struct tty_struct *tty)
>> */
>> int tty_hung_up_p(struct file *filp)
>> {
>> - return (filp && filp->f_op == &hung_up_tty_fops);
>> + return filp && filp->f_op == &tty_fops &&
>> + /* Accept race with __tty_hangup(). */
>> + data_race(((struct tty_file_private *) filp->private_data)->hung);
>
> Same here.

Because whether __tty_hangup() already changed from false to true due to
concurrent access does not matter. The race KCSAN reported is harmless (unless
callbacks suddenly disappear).



>> @@ -911,6 +903,8 @@ static ssize_t tty_read(struct kiocb *iocb, struct iov_iter *to)
>> struct tty_struct *tty = file_tty(file);
>> struct tty_ldisc *ld;
>>
>> + if (tty_hung_up_p(file))
>> + return hung_up_tty_read(iocb, to);
>
> What happens if you hang up _right_ after this check? There's no
> locking here, right? Same everywhere else you have this pattern, you
> made the race window smaller, but it's still there from what I can see.

We cannot close the race window without introducing locking,
but we don't need to close the race window.

The race KCSAN found in this report is harmless, as long as callbacks
reachable via filp->f_op does not disappear.

This patch prevents filp->f_op from suddenly disappearing callbacks,
by not changing the value of filp->f_op.



>> @@ -255,6 +255,7 @@ struct tty_file_private {
>> struct tty_struct *tty;
>> struct file *file;
>> struct list_head list;
>> + bool hung;
>
> No hint as to what "hung" means here?

Whether __tty_hangup() was called or not.


2023-05-30 13:20:00

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v2] tty: tty_io: remove hung_up_tty_fops

On Tue, May 30, 2023 at 08:57:42PM +0900, Tetsuo Handa wrote:
> On 2023/05/30 19:44, Greg Kroah-Hartman wrote:
> > On Sun, May 14, 2023 at 10:02:26AM +0900, Tetsuo Handa wrote:
> >> If we care about only NULL pointer dereference, implementing missing
> >> callbacks to hung_up_tty_fops is fine. But if we also care about KCSAN
> >> reports, we will need to wrap all filp->f_op usages which are reachable
> >> via tty_fops callbacks using data_race().
> >
> > I'm missing something here. Why would KCSAN report problems if we
> > implement the needed callbacks in hung_up_tty_fops? And what reports
> > would they be?
>
> Unlike atomic operations such as atomic_read()/atomic_set(), normal read/write
> operations are not atomic for KCSAN. KCSAN reports some value being changed
> during a read/write.
>
> In this report, KCSAN detected that __tty_hangup() changed the value of
> filp->f_op from 0xffffffff84e91ed0 to 0xffffffff84e91dc0 at
>
> filp->f_op = &hung_up_tty_fops;
>
> line when __fput() was reading the value of filp->f_op at
>
> if (file->f_op->release)
>
> line.
>
> Even if we implement the needed callbacks in hung_up_tty_fops,
> KCSAN will continue reporting that the value of filp->f_op changes.

That sounds like a bug in KCSAN, let's not add loads of infrastructure
just because we have bad tools.

> > And why would data_race() help here?
>
> data_race() tells KCSAN not to report.
> data_race() is used when the race KCSAN checks is harmless.

Again, document it, and also perhaps, not use KCSAN? :)

> >> @@ -182,7 +182,7 @@ int tty_alloc_file(struct file *file)
> >> {
> >> struct tty_file_private *priv;
> >>
> >> - priv = kmalloc(sizeof(*priv), GFP_KERNEL);
> >> + priv = kzalloc(sizeof(*priv), GFP_KERNEL);
> >
> > Why is this zeroing out everything now? Just because you added one
> > bool? Why not just set the bool properly instead?
>
> Because I consider that this function is not performance critical where
> avoid increasing code size by zeroing out everything is acceptable.

It happens on open() which yes, is not performance critical, but you are
now requiring it where before this was not required. Which isn't always
so obvious, right?

> >> @@ -911,6 +903,8 @@ static ssize_t tty_read(struct kiocb *iocb, struct iov_iter *to)
> >> struct tty_struct *tty = file_tty(file);
> >> struct tty_ldisc *ld;
> >>
> >> + if (tty_hung_up_p(file))
> >> + return hung_up_tty_read(iocb, to);
> >
> > What happens if you hang up _right_ after this check? There's no
> > locking here, right? Same everywhere else you have this pattern, you
> > made the race window smaller, but it's still there from what I can see.
>
> We cannot close the race window without introducing locking,
> but we don't need to close the race window.
>
> The race KCSAN found in this report is harmless, as long as callbacks
> reachable via filp->f_op does not disappear.

Which we can fix. So let's fix that and then not worry about these
false-positives with KCSAN as it's obviously wrong. That would make for
a much smaller and simpler and easier-to-maintain-over-time change.

Please do that instead.

> This patch prevents filp->f_op from suddenly disappearing callbacks,
> by not changing the value of filp->f_op.
>
>
>
> >> @@ -255,6 +255,7 @@ struct tty_file_private {
> >> struct tty_struct *tty;
> >> struct file *file;
> >> struct list_head list;
> >> + bool hung;
> >
> > No hint as to what "hung" means here?
>
> Whether __tty_hangup() was called or not.

How will you know this in 5 years when you see this new field?
Documentation matters.

thanks,

greg k-h