2024-02-09 07:10:25

by Kent Overstreet

Subject: [PATCH] kernel/hung_task.c: export sysctl_hung_task_timeout_secs

needed for thread_with_file; also rare but not unheard of to need this
in module code, when blocking on user input.

one workaround used by some code is wait_event_interruptible() - but
that can be buggy if the outer context isn't expecting unwinding.

Signed-off-by: Kent Overstreet <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: fuyuanli <[email protected]>
---
 kernel/hung_task.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index 9a24574988d2..b2fc2727d654 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -43,6 +43,7 @@ static int __read_mostly sysctl_hung_task_check_count = PID_MAX_LIMIT;
  * Zero means infinite timeout - no checking done:
  */
 unsigned long __read_mostly sysctl_hung_task_timeout_secs = CONFIG_DEFAULT_HUNG_TASK_TIMEOUT;
+EXPORT_SYMBOL_GPL(sysctl_hung_task_timeout_secs);

 /*
  * Zero (default value) means use sysctl_hung_task_timeout_secs:
--
2.43.0



2024-02-09 22:13:32

by Andrew Morton

Subject: Re: [PATCH] kernel/hung_task.c: export sysctl_hung_task_timeout_secs

On Fri, 9 Feb 2024 02:09:35 -0500 Kent Overstreet <[email protected]> wrote:

> needed for thread_with_file; also rare but not unheard of to need this
> in module code, when blocking on user input.

I see no bcachefs code in linux-next which uses this. All I have to go
with is the above explanation-free assertion. IOW this patch is
unreviewable.

> one workaround used by some code is wait_event_interruptible()

examples?

> - but that can be buggy if the outer context isn't expecting unwinding.

More explanation of this?

> --- a/kernel/hung_task.c
> +++ b/kernel/hung_task.c
> @@ -43,6 +43,7 @@ static int __read_mostly sysctl_hung_task_check_count = PID_MAX_LIMIT;
> * Zero means infinite timeout - no checking done:
> */
> unsigned long __read_mostly sysctl_hung_task_timeout_secs = CONFIG_DEFAULT_HUNG_TASK_TIMEOUT;
> +EXPORT_SYMBOL_GPL(sysctl_hung_task_timeout_secs);

It seems strange that a module would want this. Makes one wonder what
the heck is going on in there.


2024-02-09 22:23:46

by Kent Overstreet

Subject: Re: [PATCH] kernel/hung_task.c: export sysctl_hung_task_timeout_secs

On Fri, Feb 09, 2024 at 02:13:24PM -0800, Andrew Morton wrote:
> On Fri, 9 Feb 2024 02:09:35 -0500 Kent Overstreet <[email protected]> wrote:
>
> > needed for thread_with_file; also rare but not unheard of to need this
> > in module code, when blocking on user input.
>
> I see no bcachefs code in linux-next which uses this. All I have to go
> with is the above explanation-free assertion. IOW this patch is
> unreviewable.
>
> > one workaround used by some code is wait_event_interruptible()
>
> examples?

fs/bcachefs/util.h kthread_wait_event(); we use that - among other
things - when the kthread is parked waiting for userspace to flip it on.

TASK_INTERRUPTIBLE was the suggestion I got years ago, but I want to get
away from it because -

>
> > - but that can be buggy if the outer context isn't expecting unwinding.
>
> More explanation of this?

We're starting to think about this a bit more because of David Howells'
proposal; the idea is that perhaps TASK_UNINTERRUPTIBLE vs.
TASK_INTERRUPTIBLE vs. TASK_KILLABLE should not be set at the
waiting context; it should be set at the outer context where we would
handle (or not handle) -ERESTARTSYS.

think mutex_lock() vs. mutex_lock_killable(); that is bubbling up the
context specification in an ad hoc way. This would regularize that.

I've also seen bugs where code was doing a fixed TASK_INTERRUPTIBLE and
the outer context wasn't expecting that - kthread creation does this.

>
> > --- a/kernel/hung_task.c
> > +++ b/kernel/hung_task.c
> > @@ -43,6 +43,7 @@ static int __read_mostly sysctl_hung_task_check_count = PID_MAX_LIMIT;
> > * Zero means infinite timeout - no checking done:
> > */
> > unsigned long __read_mostly sysctl_hung_task_timeout_secs = CONFIG_DEFAULT_HUNG_TASK_TIMEOUT;
> > +EXPORT_SYMBOL_GPL(sysctl_hung_task_timeout_secs);
>
> It seems strange that a module would want this. Makes one wonder what
> the heck is going on in there.

specifically, this is for thread_with_file, where we've got a kthread
hooked up to a file descriptor, effectively using it as both stdin and
stdout.

When the kthread reads from the fd, that can block for an unbounded
amount of time - we're waiting on userspace input and it's totally fine.
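
The unwinding concern raised earlier in this message can be modelled in userspace. The sketch below is an illustration under stated assumptions, not kernel code: `model_wait_interruptible()`, `struct waiter`, and both callers are hypothetical names, and `ERESTARTSYS` is defined locally because it is a kernel-internal errno. It shows one reading of the problem: an interruptible wait can return early, and an outer context that does not expect that carries on with data that was never produced.

```c
#include <stdbool.h>

#define ERESTARTSYS 512	/* kernel-internal errno, defined here for the model */

/* Hypothetical state shared between a producer and a sleeping consumer. */
struct waiter {
	bool data_ready;	/* the condition the consumer waits for */
	bool signal_pending;	/* models a signal arriving during the sleep */
	int data;
};

/*
 * Models an interruptible wait: returns 0 once the data is ready, or
 * -ERESTARTSYS if a signal cuts the wait short.
 */
static int model_wait_interruptible(struct waiter *w)
{
	if (!w->data_ready && w->signal_pending)
		return -ERESTARTSYS;
	/* No signal: model the producer eventually filling in the data. */
	if (!w->data_ready) {
		w->data = 42;
		w->data_ready = true;
	}
	return 0;
}

/* Outer context that expects unwinding: propagates the error untouched. */
int consume_checked(struct waiter *w, int *out)
{
	int ret = model_wait_interruptible(w);

	if (ret)
		return ret;
	*out = w->data;
	return 0;
}

/* Outer context that does not expect unwinding: the result is ignored. */
int consume_unchecked(struct waiter *w, int *out)
{
	(void)model_wait_interruptible(w);
	*out = w->data;	/* bug: may read data that was never produced */
	return 0;
}
```

With a pending signal, `consume_checked()` returns `-ERESTARTSYS` and leaves `*out` alone, while `consume_unchecked()` reports success with stale data.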

2024-02-15 08:33:19

by Christoph Hellwig

Subject: Re: [PATCH] kernel/hung_task.c: export sysctl_hung_task_timeout_secs

On Fri, Feb 09, 2024 at 02:09:35AM -0500, Kent Overstreet wrote:
> needed for thread_with_file; also rare but not unheard of to need this
> in module code, when blocking on user input.
>
> one workaround used by some code is wait_event_interruptible() - but
> that can be buggy if the outer context isn't expecting unwinding.

I don't think just exporting the variable and thus allowing write
access is a good idea. If we want to keep going down the route of
this hack we should add an accessor function that returns the value.

The cleaner solution would be a new task state that explicitly
marks code that can sleep forever without triggering the hang
check. Although this might be a bit invasive and take a while.
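
A minimal userspace sketch of the accessor idea suggested above, not kernel code. The variable stays file-local and only a getter is visible to other code, so callers can read the timeout but cannot write it. The names `hung_task_timeout_secs_get()` and `hung_task_sysctl_write()` and the default value are assumptions made for illustration.

```c
/* Userspace model of "export an accessor, not the variable". */

/* Stand-in for CONFIG_DEFAULT_HUNG_TASK_TIMEOUT; the value is a placeholder. */
#define DEFAULT_HUNG_TASK_TIMEOUT 120UL

/* File-local: other translation units cannot name or modify this. */
static unsigned long hung_task_timeout_secs = DEFAULT_HUNG_TASK_TIMEOUT;

/*
 * Hypothetical read-only accessor. In the kernel, this function (rather
 * than the variable) would be the thing carrying EXPORT_SYMBOL_GPL().
 */
unsigned long hung_task_timeout_secs_get(void)
{
	return hung_task_timeout_secs;
}

/* Models the sysctl handler; in the kernel it would stay unexported. */
void hung_task_sysctl_write(unsigned long secs)
{
	hung_task_timeout_secs = secs;
}
```

A module would then call `hung_task_timeout_secs_get()` each time it needs the current value, and a change made through the sysctl path is visible on the next call.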


2024-02-15 18:55:18

by Andrew Morton

Subject: Re: [PATCH] kernel/hung_task.c: export sysctl_hung_task_timeout_secs

On Wed, 14 Feb 2024 21:26:34 -0800 Christoph Hellwig <[email protected]> wrote:

> On Fri, Feb 09, 2024 at 02:09:35AM -0500, Kent Overstreet wrote:
> > needed for thread_with_file; also rare but not unheard of to need this
> > in module code, when blocking on user input.
> >
> > one workaround used by some code is wait_event_interruptible() - but
> > that can be buggy if the outer context isn't expecting unwinding.
>
> I don't think just exporting the variable and thus allowing write
> access is a good idea. If we want to keep going down the route of
> this hack we should add an accessor function that returns the value.
>
> The cleaner solution would be a new task state that explicitly
> marks code that can sleep forever without triggering the hang
> check. Although this might be a bit invasive and take a while.

A new PF_whatever flag would solve that simply?

Which are the potential use sites for such a thing?

2024-02-15 23:27:18

by Kent Overstreet

Subject: Re: [PATCH] kernel/hung_task.c: export sysctl_hung_task_timeout_secs

On Thu, Feb 15, 2024 at 10:55:09AM -0800, Andrew Morton wrote:
> On Wed, 14 Feb 2024 21:26:34 -0800 Christoph Hellwig <[email protected]> wrote:
>
> > On Fri, Feb 09, 2024 at 02:09:35AM -0500, Kent Overstreet wrote:
> > > needed for thread_with_file; also rare but not unheard of to need this
> > > in module code, when blocking on user input.
> > >
> > > one workaround used by some code is wait_event_interruptible() - but
> > > that can be buggy if the outer context isn't expecting unwinding.
> >
> > I don't think just exporting the variable and thus allowing write
> > access is a good idea. If we want to keep going down the route of
> > this hack we should add an accessor function that returns the value.
> >
> > The cleaner solution would be a new task state that explicitly
> > marks code that can sleep forever without triggering the hang
> > check. Although this might be a bit invasive and take a while.

I had the same thought.

> A new PF_whatever flag would solve that simply?

TASK_* flags are separate from PF_* flags, fortunately, and it doesn't
look like anything but TASK_* flags go in task_struct->__state, so
this shouldn't be a difficult change.

> Which are the potential use sites for such a thing?

There's a few places in the block layer that are using the sysctl value;
those will be easy to fix. There's definitely more places abusing
TASK_INTERRUPTIBLE, but aside from the ones in my code I can't think of
a way to search for them.

But the block layer ones look a little suspect to me: the commit message
indicates they were added because discards on some devices can take >
100 seconds - which is true, but this is a more general problem, there's
other places we block on IO.

Might want to give this some more thought.
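
For readers unfamiliar with how code uses the sysctl value, the usual shape is to wait in slices shorter than the hung-task timeout, so the sleeper wakes up before the detector would flag it. The userspace sketch below models that shape only. It is an assumption about what such call sites look like, not a copy of them: `struct fake_completion`, `fake_wait_timeout()`, `wait_without_hang_warning()`, and `TICKS_PER_SEC` are all hypothetical.

```c
#include <stdbool.h>

#define TICKS_PER_SEC 100UL	/* placeholder for the kernel's HZ */

/* Hypothetical stand-in for a kernel completion: done after N timed waits. */
struct fake_completion {
	unsigned int polls_until_done;
};

/*
 * Models a timed wait: returns true if the event completed within the
 * slice, false on timeout. Nothing actually sleeps in this model, so
 * slice_ticks is unused.
 */
static bool fake_wait_timeout(struct fake_completion *c, unsigned long slice_ticks)
{
	(void)slice_ticks;
	if (c->polls_until_done > 0)
		c->polls_until_done--;
	return c->polls_until_done == 0;
}

/*
 * Wait for completion without any single sleep exceeding half the
 * hung-task timeout. timeout_secs == 0 means checking is disabled, so one
 * unbounded wait is fine. Returns the number of timed slices used
 * (0 for the unbounded path).
 */
unsigned int wait_without_hang_warning(struct fake_completion *c,
				       unsigned long timeout_secs)
{
	unsigned int slices = 0;

	if (timeout_secs == 0) {
		/* Unbounded wait, modelled by draining the counter. */
		c->polls_until_done = 0;
		return 0;
	}

	do {
		slices++;
	} while (!fake_wait_timeout(c, timeout_secs * (TICKS_PER_SEC / 2)));

	return slices;
}
```

The sketch also shows why this is a workaround rather than a fix: every waiter has to know about the detector's timeout, which is the coupling a dedicated task state would remove.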

2024-02-16 07:11:32

by Christoph Hellwig

Subject: Re: [PATCH] kernel/hung_task.c: export sysctl_hung_task_timeout_secs

On Thu, Feb 15, 2024 at 06:26:59PM -0500, Kent Overstreet wrote:
> There's a few places in the block layer that are using the sysctl value;
> those will be easy to fix. There's definitely more places abusing
> TASK_INTERRUPTIBLE, but aside from the ones in my code I can't think of
> a way to search for them.

I think any kthread that is woken using wake_up_process or a wait queue
is a good candidate.