2021-03-26 15:53:28

by Jens Axboe

[permalink] [raw]
Subject: [PATCHSET v2 0/7] Allow signals for IO threads

Hi,

For the v1 posting, see here:

https://lore.kernel.org/io-uring/[email protected]/

I've run this through the usual testing, and it's running long term right
now. I've tested the cases that Stefan reported, and we seem fine now.

Changes since v1:

- Catch fatal signals in get_signal() for PF_IO_WORKER. This is only a
problem for nested signals, like SIGSTOP followed by SIGKILL. We
can't have get_signal() calling do_exit() on behalf of the IO threads,
they have cleanups to do. Thanks Stefan.

- Move signal masking to when the PF_IO_WORKER thread is created, and since
we now handle SIGSTOP, unmask that as well. Thanks Oleg.

- Remove try_to_freeze() parts in IO threads, we don't need those anymore
with the calling of get_signal().

- Minor cleanups.

fs/io-wq.c | 24 +++++++++++++++++-------
fs/io_uring.c | 12 ++++++++----
kernel/fork.c | 16 ++++++++--------
kernel/freezer.c | 2 +-
kernel/ptrace.c | 2 +-
kernel/signal.c | 19 ++++++++++++-------
6 files changed, 47 insertions(+), 28 deletions(-)

--
Jens Axboe



2021-03-26 15:53:29

by Jens Axboe

[permalink] [raw]
Subject: [PATCH 1/7] kernel: don't call do_exit() for PF_IO_WORKER threads

Right now we're never calling get_signal() from PF_IO_WORKER threads, but
in preparation for doing so, don't handle a fatal signal for them. The
workers have state they need to cleanup when exiting, and they don't do
coredumps, so just return instead of performing either a dump or calling
do_exit() on their behalf. The threads themselves will detect a fatal
signal and do proper shutdown.

Signed-off-by: Jens Axboe <[email protected]>
---
kernel/signal.c | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/kernel/signal.c b/kernel/signal.c
index f2a1b898da29..e3e1b8fbfe8a 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2756,6 +2756,15 @@ bool get_signal(struct ksignal *ksig)
*/
current->flags |= PF_SIGNALED;

+ /*
+ * PF_IO_WORKER threads will catch and exit on fatal signals
+ * themselves. They have cleanup that must be performed, so
+ * we cannot call do_exit() on their behalf. coredumps also
+ * do not apply to them.
+ */
+ if (current->flags & PF_IO_WORKER)
+ return false;
+
if (sig_kernel_coredump(signr)) {
if (print_fatal_signals)
print_fatal_signal(ksig->info.si_signo);
--
2.31.0

2021-03-26 15:53:29

by Jens Axboe

[permalink] [raw]
Subject: [PATCH 2/7] io_uring: handle signals for IO threads like a normal thread

We go through various hoops to disallow signals for the IO threads, but
there's really no reason why we cannot just allow them. The IO threads
never return to userspace like a normal thread, and hence don't go through
normal signal processing. Instead, just check for a pending signal as part
of the work loop, and call get_signal() to handle it for us if anything
is pending.

With that, we can support receiving signals, including special ones like
SIGSTOP.

Signed-off-by: Jens Axboe <[email protected]>
---
fs/io-wq.c | 24 +++++++++++++++++-------
fs/io_uring.c | 12 ++++++++----
2 files changed, 25 insertions(+), 11 deletions(-)

diff --git a/fs/io-wq.c b/fs/io-wq.c
index b7c1fa932cb3..3e2f059a1737 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -16,7 +16,6 @@
#include <linux/rculist_nulls.h>
#include <linux/cpu.h>
#include <linux/tracehook.h>
-#include <linux/freezer.h>

#include "../kernel/sched/sched.h"
#include "io-wq.h"
@@ -503,10 +502,16 @@ static int io_wqe_worker(void *data)
if (io_flush_signals())
continue;
ret = schedule_timeout(WORKER_IDLE_TIMEOUT);
- if (try_to_freeze() || ret)
+ if (signal_pending(current)) {
+ struct ksignal ksig;
+
+ if (fatal_signal_pending(current))
+ break;
+ if (get_signal(&ksig))
+ continue;
+ }
+ if (ret)
continue;
- if (fatal_signal_pending(current))
- break;
/* timed out, exit unless we're the fixed worker */
if (test_bit(IO_WQ_BIT_EXIT, &wq->state) ||
!(worker->flags & IO_WORKER_F_FIXED))
@@ -714,9 +719,14 @@ static int io_wq_manager(void *data)
set_current_state(TASK_INTERRUPTIBLE);
io_wq_check_workers(wq);
schedule_timeout(HZ);
- try_to_freeze();
- if (fatal_signal_pending(current))
- set_bit(IO_WQ_BIT_EXIT, &wq->state);
+ if (signal_pending(current)) {
+ struct ksignal ksig;
+
+ if (fatal_signal_pending(current))
+ set_bit(IO_WQ_BIT_EXIT, &wq->state);
+ else if (get_signal(&ksig))
+ continue;
+ }
} while (!test_bit(IO_WQ_BIT_EXIT, &wq->state));

io_wq_check_workers(wq);
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 54ea561db4a5..350418a88db3 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -78,7 +78,6 @@
#include <linux/task_work.h>
#include <linux/pagemap.h>
#include <linux/io_uring.h>
-#include <linux/freezer.h>

#define CREATE_TRACE_POINTS
#include <trace/events/io_uring.h>
@@ -6765,8 +6764,14 @@ static int io_sq_thread(void *data)
timeout = jiffies + sqd->sq_thread_idle;
continue;
}
- if (fatal_signal_pending(current))
- break;
+ if (signal_pending(current)) {
+ struct ksignal ksig;
+
+ if (fatal_signal_pending(current))
+ break;
+ if (get_signal(&ksig))
+ continue;
+ }
sqt_spin = false;
cap_entries = !list_is_singular(&sqd->ctx_list);
list_for_each_entry(ctx, &sqd->ctx_list, sqd_list) {
@@ -6809,7 +6814,6 @@ static int io_sq_thread(void *data)

mutex_unlock(&sqd->lock);
schedule();
- try_to_freeze();
mutex_lock(&sqd->lock);
list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
io_ring_clear_wakeup_flag(ctx);
--
2.31.0

2021-03-26 15:53:39

by Jens Axboe

[permalink] [raw]
Subject: [PATCH 03/10] Revert "signal: don't allow sending any signals to PF_IO_WORKER threads"

This reverts commit 5be28c8f85ce99ed2d329d2ad8bdd18ea19473a5.

IO threads now take signals just fine, so there's no reason to limit them
specifically. Revert the change that prevented that from happening.

Signed-off-by: Jens Axboe <[email protected]>
---
kernel/signal.c | 3 ---
1 file changed, 3 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index e3e1b8fbfe8a..af890479921a 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -834,9 +834,6 @@ static int check_kill_permission(int sig, struct kernel_siginfo *info,

if (!valid_signal(sig))
return -EINVAL;
- /* PF_IO_WORKER threads don't take any signals */
- if (t->flags & PF_IO_WORKER)
- return -ESRCH;

if (!si_fromuser(info))
return 0;
--
2.31.0

2021-03-26 15:54:00

by Jens Axboe

[permalink] [raw]
Subject: [PATCH 5/7] Revert "kernel: treat PF_IO_WORKER like PF_KTHREAD for ptrace/signals"

This reverts commit 6fb8f43cede0e4bd3ead847de78d531424a96be9.

The IO threads do allow signals now, including SIGSTOP, and we can allow
ptrace attach. Attaching won't reveal anything interesting for the IO
threads, but it will allow eg gdb to attach to a task with io_urings
and IO threads without complaining. And once attached, it will allow
the usual introspection into regular threads.

Signed-off-by: Jens Axboe <[email protected]>
---
kernel/ptrace.c | 2 +-
kernel/signal.c | 4 ++--
2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 821cf1723814..61db50f7ca86 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -375,7 +375,7 @@ static int ptrace_attach(struct task_struct *task, long request,
audit_ptrace(task);

retval = -EPERM;
- if (unlikely(task->flags & (PF_KTHREAD | PF_IO_WORKER)))
+ if (unlikely(task->flags & PF_KTHREAD))
goto out;
if (same_thread_group(task, current))
goto out;
diff --git a/kernel/signal.c b/kernel/signal.c
index af890479921a..76d85830d4fa 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -91,7 +91,7 @@ static bool sig_task_ignored(struct task_struct *t, int sig, bool force)
return true;

/* Only allow kernel generated signals to this kthread */
- if (unlikely((t->flags & (PF_KTHREAD | PF_IO_WORKER)) &&
+ if (unlikely((t->flags & PF_KTHREAD) &&
(handler == SIG_KTHREAD_KERNEL) && !force))
return true;

@@ -1097,7 +1097,7 @@ static int __send_signal(int sig, struct kernel_siginfo *info, struct task_struc
/*
* Skip useless siginfo allocation for SIGKILL and kernel threads.
*/
- if ((sig == SIGKILL) || (t->flags & (PF_KTHREAD | PF_IO_WORKER)))
+ if ((sig == SIGKILL) || (t->flags & PF_KTHREAD))
goto out_set;

/*
--
2.31.0

2021-03-26 15:54:00

by Jens Axboe

[permalink] [raw]
Subject: [PATCH 04/10] Revert "kernel: treat PF_IO_WORKER like PF_KTHREAD for ptrace/signals"

This reverts commit 6fb8f43cede0e4bd3ead847de78d531424a96be9.

The IO threads do allow signals now, including SIGSTOP, and we can allow
ptrace attach. Attaching won't reveal anything interesting for the IO
threads, but it will allow eg gdb to attach to a task with io_urings
and IO threads without complaining. And once attached, it will allow
the usual introspection into regular threads.

Signed-off-by: Jens Axboe <[email protected]>
---
kernel/ptrace.c | 2 +-
kernel/signal.c | 4 ++--
2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 821cf1723814..61db50f7ca86 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -375,7 +375,7 @@ static int ptrace_attach(struct task_struct *task, long request,
audit_ptrace(task);

retval = -EPERM;
- if (unlikely(task->flags & (PF_KTHREAD | PF_IO_WORKER)))
+ if (unlikely(task->flags & PF_KTHREAD))
goto out;
if (same_thread_group(task, current))
goto out;
diff --git a/kernel/signal.c b/kernel/signal.c
index af890479921a..76d85830d4fa 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -91,7 +91,7 @@ static bool sig_task_ignored(struct task_struct *t, int sig, bool force)
return true;

/* Only allow kernel generated signals to this kthread */
- if (unlikely((t->flags & (PF_KTHREAD | PF_IO_WORKER)) &&
+ if (unlikely((t->flags & PF_KTHREAD) &&
(handler == SIG_KTHREAD_KERNEL) && !force))
return true;

@@ -1097,7 +1097,7 @@ static int __send_signal(int sig, struct kernel_siginfo *info, struct task_struc
/*
* Skip useless siginfo allocation for SIGKILL and kernel threads.
*/
- if ((sig == SIGKILL) || (t->flags & (PF_KTHREAD | PF_IO_WORKER)))
+ if ((sig == SIGKILL) || (t->flags & PF_KTHREAD))
goto out_set;

/*
--
2.31.0

2021-03-26 15:54:00

by Jens Axboe

[permalink] [raw]
Subject: [PATCH 4/7] Revert "signal: don't allow sending any signals to PF_IO_WORKER threads"

This reverts commit 5be28c8f85ce99ed2d329d2ad8bdd18ea19473a5.

IO threads now take signals just fine, so there's no reason to limit them
specifically. Revert the change that prevented that from happening.

Signed-off-by: Jens Axboe <[email protected]>
---
kernel/signal.c | 3 ---
1 file changed, 3 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index e3e1b8fbfe8a..af890479921a 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -834,9 +834,6 @@ static int check_kill_permission(int sig, struct kernel_siginfo *info,

if (!valid_signal(sig))
return -EINVAL;
- /* PF_IO_WORKER threads don't take any signals */
- if (t->flags & PF_IO_WORKER)
- return -ESRCH;

if (!si_fromuser(info))
return 0;
--
2.31.0

2021-03-26 15:54:00

by Jens Axboe

[permalink] [raw]
Subject: [PATCH 05/10] Revert "kernel: freezer should treat PF_IO_WORKER like PF_KTHREAD for freezing"

This reverts commit 15b2219facadec583c24523eed40fa45865f859f.

Before IO threads accepted signals, the freezer using take signals to wake
up an IO thread would cause them to loop without any way to clear the
pending signal. That is no longer the case, so stop special casing
PF_IO_WORKER in the freezer.

Signed-off-by: Jens Axboe <[email protected]>
---
kernel/freezer.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/freezer.c b/kernel/freezer.c
index 1a2d57d1327c..dc520f01f99d 100644
--- a/kernel/freezer.c
+++ b/kernel/freezer.c
@@ -134,7 +134,7 @@ bool freeze_task(struct task_struct *p)
return false;
}

- if (!(p->flags & (PF_KTHREAD | PF_IO_WORKER)))
+ if (!(p->flags & PF_KTHREAD))
fake_signal_wake_up(p);
else
wake_up_state(p, TASK_INTERRUPTIBLE);
--
2.31.0

2021-03-26 15:54:00

by Jens Axboe

[permalink] [raw]
Subject: [PATCH 6/7] Revert "kernel: freezer should treat PF_IO_WORKER like PF_KTHREAD for freezing"

This reverts commit 15b2219facadec583c24523eed40fa45865f859f.

Before IO threads accepted signals, the freezer using take signals to wake
up an IO thread would cause them to loop without any way to clear the
pending signal. That is no longer the case, so stop special casing
PF_IO_WORKER in the freezer.

Signed-off-by: Jens Axboe <[email protected]>
---
kernel/freezer.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/freezer.c b/kernel/freezer.c
index 1a2d57d1327c..dc520f01f99d 100644
--- a/kernel/freezer.c
+++ b/kernel/freezer.c
@@ -134,7 +134,7 @@ bool freeze_task(struct task_struct *p)
return false;
}

- if (!(p->flags & (PF_KTHREAD | PF_IO_WORKER)))
+ if (!(p->flags & PF_KTHREAD))
fake_signal_wake_up(p);
else
wake_up_state(p, TASK_INTERRUPTIBLE);
--
2.31.0

2021-03-26 15:54:00

by Jens Axboe

[permalink] [raw]
Subject: [PATCH 07/10] io_uring: fix timeout cancel return code

From: Pavel Begunkov <[email protected]>

When we cancel a timeout we should emit a sensible return code, like
-ECANCELED but not 0, otherwise it may trick users.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/7b0ad1065e3bd1994722702bd0ba9e7bc9b0683b.1616696997.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>
---
fs/io_uring.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 350418a88db3..4d0cb2548a67 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1247,7 +1247,7 @@ static void io_queue_async_work(struct io_kiocb *req)
io_queue_linked_timeout(link);
}

-static void io_kill_timeout(struct io_kiocb *req)
+static void io_kill_timeout(struct io_kiocb *req, int status)
{
struct io_timeout_data *io = req->async_data;
int ret;
@@ -1257,7 +1257,7 @@ static void io_kill_timeout(struct io_kiocb *req)
atomic_set(&req->ctx->cq_timeouts,
atomic_read(&req->ctx->cq_timeouts) + 1);
list_del_init(&req->timeout.list);
- io_cqring_fill_event(req, 0);
+ io_cqring_fill_event(req, status);
io_put_req_deferred(req, 1);
}
}
@@ -1274,7 +1274,7 @@ static bool io_kill_timeouts(struct io_ring_ctx *ctx, struct task_struct *tsk,
spin_lock_irq(&ctx->completion_lock);
list_for_each_entry_safe(req, tmp, &ctx->timeout_list, timeout.list) {
if (io_match_task(req, tsk, files)) {
- io_kill_timeout(req);
+ io_kill_timeout(req, -ECANCELED);
canceled++;
}
}
@@ -1326,7 +1326,7 @@ static void io_flush_timeouts(struct io_ring_ctx *ctx)
break;

list_del_init(&req->timeout.list);
- io_kill_timeout(req);
+ io_kill_timeout(req, 0);
} while (!list_empty(&ctx->timeout_list));

ctx->cq_last_tm_flush = seq;
--
2.31.0

2021-03-26 15:54:02

by Jens Axboe

[permalink] [raw]
Subject: [PATCH 7/7] Revert "signal: don't allow STOP on PF_IO_WORKER threads"

This reverts commit 4db4b1a0d1779dc159f7b87feb97030ec0b12597.

The IO threads allow and handle SIGSTOP now, so don't special case them
anymore in task_set_jobctl_pending().

Signed-off-by: Jens Axboe <[email protected]>
---
kernel/signal.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index 76d85830d4fa..5b75fbe3d2d6 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -288,8 +288,7 @@ bool task_set_jobctl_pending(struct task_struct *task, unsigned long mask)
JOBCTL_STOP_SIGMASK | JOBCTL_TRAPPING));
BUG_ON((mask & JOBCTL_TRAPPING) && !(mask & JOBCTL_PENDING_MASK));

- if (unlikely(fatal_signal_pending(task) ||
- (task->flags & (PF_EXITING | PF_IO_WORKER))))
+ if (unlikely(fatal_signal_pending(task) || (task->flags & PF_EXITING)))
return false;

if (mask & JOBCTL_STOP_SIGMASK)
--
2.31.0

2021-03-26 15:54:02

by Jens Axboe

[permalink] [raw]
Subject: [PATCH 06/10] Revert "signal: don't allow STOP on PF_IO_WORKER threads"

This reverts commit 4db4b1a0d1779dc159f7b87feb97030ec0b12597.

The IO threads allow and handle SIGSTOP now, so don't special case them
anymore in task_set_jobctl_pending().

Signed-off-by: Jens Axboe <[email protected]>
---
kernel/signal.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index 76d85830d4fa..5b75fbe3d2d6 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -288,8 +288,7 @@ bool task_set_jobctl_pending(struct task_struct *task, unsigned long mask)
JOBCTL_STOP_SIGMASK | JOBCTL_TRAPPING));
BUG_ON((mask & JOBCTL_TRAPPING) && !(mask & JOBCTL_PENDING_MASK));

- if (unlikely(fatal_signal_pending(task) ||
- (task->flags & (PF_EXITING | PF_IO_WORKER))))
+ if (unlikely(fatal_signal_pending(task) || (task->flags & PF_EXITING)))
return false;

if (mask & JOBCTL_STOP_SIGMASK)
--
2.31.0

2021-03-26 15:54:09

by Jens Axboe

[permalink] [raw]
Subject: [PATCH 08/10] io_uring: do post-completion chore on t-out cancel

From: Pavel Begunkov <[email protected]>

Don't forget about io_commit_cqring() + io_cqring_ev_posted() after
exit/exec cancelling timeouts. Both functions declared only after
io_kill_timeouts(), so to avoid tons of forward declarations move
it down.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/72ace588772c0f14834a6a4185d56c445a366fb4.1616696997.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>
---
fs/io_uring.c | 42 ++++++++++++++++++++++--------------------
1 file changed, 22 insertions(+), 20 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 4d0cb2548a67..69896ae204d6 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1262,26 +1262,6 @@ static void io_kill_timeout(struct io_kiocb *req, int status)
}
}

-/*
- * Returns true if we found and killed one or more timeouts
- */
-static bool io_kill_timeouts(struct io_ring_ctx *ctx, struct task_struct *tsk,
- struct files_struct *files)
-{
- struct io_kiocb *req, *tmp;
- int canceled = 0;
-
- spin_lock_irq(&ctx->completion_lock);
- list_for_each_entry_safe(req, tmp, &ctx->timeout_list, timeout.list) {
- if (io_match_task(req, tsk, files)) {
- io_kill_timeout(req, -ECANCELED);
- canceled++;
- }
- }
- spin_unlock_irq(&ctx->completion_lock);
- return canceled != 0;
-}
-
static void __io_queue_deferred(struct io_ring_ctx *ctx)
{
do {
@@ -8612,6 +8592,28 @@ static void io_ring_exit_work(struct work_struct *work)
io_ring_ctx_free(ctx);
}

+/* Returns true if we found and killed one or more timeouts */
+static bool io_kill_timeouts(struct io_ring_ctx *ctx, struct task_struct *tsk,
+ struct files_struct *files)
+{
+ struct io_kiocb *req, *tmp;
+ int canceled = 0;
+
+ spin_lock_irq(&ctx->completion_lock);
+ list_for_each_entry_safe(req, tmp, &ctx->timeout_list, timeout.list) {
+ if (io_match_task(req, tsk, files)) {
+ io_kill_timeout(req, -ECANCELED);
+ canceled++;
+ }
+ }
+ io_commit_cqring(ctx);
+ spin_unlock_irq(&ctx->completion_lock);
+
+ if (canceled != 0)
+ io_cqring_ev_posted(ctx);
+ return canceled != 0;
+}
+
static void io_ring_ctx_wait_and_kill(struct io_ring_ctx *ctx)
{
unsigned long index;
--
2.31.0

2021-03-26 15:54:50

by Jens Axboe

[permalink] [raw]
Subject: [PATCH 3/7] kernel: stop masking signals in create_io_thread()

This is racy - move the blocking into when the task is created and
we're marking it as PF_IO_WORKER anyway. The IO threads are now
prepared to handle signals like SIGSTOP as well, so clear that from
the mask to allow proper stopping of IO threads.

Reported-by: Oleg Nesterov <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
---
kernel/fork.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index d3171e8e88e5..ddaa15227071 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1940,8 +1940,14 @@ static __latent_entropy struct task_struct *copy_process(
p = dup_task_struct(current, node);
if (!p)
goto fork_out;
- if (args->io_thread)
+ if (args->io_thread) {
+ /*
+ * Mark us an IO worker, and block any signal that isn't
+ * fatal or STOP
+ */
p->flags |= PF_IO_WORKER;
+ siginitsetinv(&p->blocked, sigmask(SIGKILL)|sigmask(SIGSTOP));
+ }

/*
* This _must_ happen before we call free_task(), i.e. before we jump
@@ -2430,14 +2436,8 @@ struct task_struct *create_io_thread(int (*fn)(void *), void *arg, int node)
.stack_size = (unsigned long)arg,
.io_thread = 1,
};
- struct task_struct *tsk;

- tsk = copy_process(NULL, 0, node, &args);
- if (!IS_ERR(tsk)) {
- sigfillset(&tsk->blocked);
- sigdelsetmask(&tsk->blocked, sigmask(SIGKILL));
- }
- return tsk;
+ return copy_process(NULL, 0, node, &args);
}

/*
--
2.31.0

2021-03-26 15:56:11

by Jens Axboe

[permalink] [raw]
Subject: [PATCH 10/10] io_uring: don't cancel extra on files match

From: Pavel Begunkov <[email protected]>

As tasks always wait and kill their io-wq on exec/exit, files are of no
more concern to us, so we don't need to specifically cancel them by hand
in those cases. Moreover we should not, because io_match_task() looks at
req->task->files now, which is always true and so leads to extra
cancellations, that wasn't a case before per-task io-wq.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/0566c1de9b9dd417f5de345c817ca953580e0e2e.1616696997.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>
---
fs/io_uring.c | 2 --
1 file changed, 2 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 4189e1b684e1..66ae46874d85 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1094,8 +1094,6 @@ static bool io_match_task(struct io_kiocb *head,
io_for_each_link(req, head) {
if (req->flags & REQ_F_INFLIGHT)
return true;
- if (req->task->files == files)
- return true;
}
return false;
}
--
2.31.0

2021-03-26 15:56:13

by Jens Axboe

[permalink] [raw]
Subject: [PATCH 09/10] io_uring: don't cancel-track common timeouts

From: Pavel Begunkov <[email protected]>

Don't account usual timeouts (i.e. not linked) as REQ_F_INFLIGHT but
keep behaviour prior to dd59a3d595cc1 ("io_uring: reliably cancel linked
timeouts").

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/104441ef5d97e3932113d44501fda0df88656b83.1616696997.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>
---
fs/io_uring.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 69896ae204d6..4189e1b684e1 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -5563,7 +5563,8 @@ static int io_timeout_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe,

data->mode = io_translate_timeout_mode(flags);
hrtimer_init(&data->timer, CLOCK_MONOTONIC, data->mode);
- io_req_track_inflight(req);
+ if (is_timeout_link)
+ io_req_track_inflight(req);
return 0;
}

--
2.31.0

2021-03-26 15:56:58

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCHSET v2 0/7] Allow signals for IO threads

On 3/26/21 9:51 AM, Jens Axboe wrote:
> Hi,
>
> For the v1 posting, see here:

Sigh, just ignore the last 4 patches (07...10/10) in this series,
there are sitting on top of this series and I messed up the git send-email.
This patch series ends in the 4 reverts.

--
Jens Axboe

2021-03-26 20:32:06

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH 2/7] io_uring: handle signals for IO threads like a normal thread

Jens Axboe <[email protected]> writes:

> We go through various hoops to disallow signals for the IO threads, but
> there's really no reason why we cannot just allow them. The IO threads
> never return to userspace like a normal thread, and hence don't go through
> normal signal processing. Instead, just check for a pending signal as part
> of the work loop, and call get_signal() to handle it for us if anything
> is pending.
>
> With that, we can support receiving signals, including special ones like
> SIGSTOP.
>
> Signed-off-by: Jens Axboe <[email protected]>
> ---
> fs/io-wq.c | 24 +++++++++++++++++-------
> fs/io_uring.c | 12 ++++++++----
> 2 files changed, 25 insertions(+), 11 deletions(-)
>
> diff --git a/fs/io-wq.c b/fs/io-wq.c
> index b7c1fa932cb3..3e2f059a1737 100644
> --- a/fs/io-wq.c
> +++ b/fs/io-wq.c
> @@ -16,7 +16,6 @@
> #include <linux/rculist_nulls.h>
> #include <linux/cpu.h>
> #include <linux/tracehook.h>
> -#include <linux/freezer.h>
>
> #include "../kernel/sched/sched.h"
> #include "io-wq.h"
> @@ -503,10 +502,16 @@ static int io_wqe_worker(void *data)
> if (io_flush_signals())
> continue;
> ret = schedule_timeout(WORKER_IDLE_TIMEOUT);
> - if (try_to_freeze() || ret)
> + if (signal_pending(current)) {
> + struct ksignal ksig;
> +
> + if (fatal_signal_pending(current))
> + break;
> + if (get_signal(&ksig))
> + continue;
^^^^^^^^^^^^^^^^^^^^^^

That is wrong. You are promising to deliver a signal to signal
handler and them simply discarding it. Perhaps:

if (!get_signal(&ksig))
continue;
WARN_ON(!sig_kernel_stop(ksig->sig));
break;


Eric

2021-03-26 20:51:10

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH 1/7] kernel: don't call do_exit() for PF_IO_WORKER threads

Jens Axboe <[email protected]> writes:

> Right now we're never calling get_signal() from PF_IO_WORKER threads, but
> in preparation for doing so, don't handle a fatal signal for them. The
> workers have state they need to cleanup when exiting, and they don't do
> coredumps, so just return instead of performing either a dump or calling
> do_exit() on their behalf. The threads themselves will detect a fatal
> signal and do proper shutdown.
>
> Signed-off-by: Jens Axboe <[email protected]>
> ---
> kernel/signal.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/kernel/signal.c b/kernel/signal.c
> index f2a1b898da29..e3e1b8fbfe8a 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -2756,6 +2756,15 @@ bool get_signal(struct ksignal *ksig)
> */
> current->flags |= PF_SIGNALED;
>
> + /*
> + * PF_IO_WORKER threads will catch and exit on fatal signals
> + * themselves. They have cleanup that must be performed, so
> + * we cannot call do_exit() on their behalf. coredumps also
> + * do not apply to them.
> + */
> + if (current->flags & PF_IO_WORKER)
> + return false;
> +

Returning false when get_signal needs the caller to handle a signal
adds a very weird and awkward special case to how get_signal returns
arguments.

Instead you should simply break and let get_signal return SIGKILL like
any other signal that has a handler that the caller of get_signal needs
to handle.

Something like:
> + /*
> + * PF_IO_WORKER have cleanup that must be performed,
> + * before calling do_exit().
> + */
> + if (current->flags & PF_IO_WORKER)
> + break;


As do_coredump does not call do_exit there is no reason to skip calling into
the coredump handling either. And allowing it will remove yet another
special case from the io worker code.

> if (sig_kernel_coredump(signr)) {
> if (print_fatal_signals)
> print_fatal_signal(ksig->info.si_signo);

Eric

2021-03-26 20:51:46

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH 3/7] kernel: stop masking signals in create_io_thread()

Jens Axboe <[email protected]> writes:

> This is racy - move the blocking into when the task is created and
> we're marking it as PF_IO_WORKER anyway. The IO threads are now
> prepared to handle signals like SIGSTOP as well, so clear that from
> the mask to allow proper stopping of IO threads.

Acked-by: "Eric W. Biederman" <[email protected]>

>
> Reported-by: Oleg Nesterov <[email protected]>
> Signed-off-by: Jens Axboe <[email protected]>
> ---
> kernel/fork.c | 16 ++++++++--------
> 1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/kernel/fork.c b/kernel/fork.c
> index d3171e8e88e5..ddaa15227071 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -1940,8 +1940,14 @@ static __latent_entropy struct task_struct *copy_process(
> p = dup_task_struct(current, node);
> if (!p)
> goto fork_out;
> - if (args->io_thread)
> + if (args->io_thread) {
> + /*
> + * Mark us an IO worker, and block any signal that isn't
> + * fatal or STOP
> + */
> p->flags |= PF_IO_WORKER;
> + siginitsetinv(&p->blocked, sigmask(SIGKILL)|sigmask(SIGSTOP));
> + }
>
> /*
> * This _must_ happen before we call free_task(), i.e. before we jump
> @@ -2430,14 +2436,8 @@ struct task_struct *create_io_thread(int (*fn)(void *), void *arg, int node)
> .stack_size = (unsigned long)arg,
> .io_thread = 1,
> };
> - struct task_struct *tsk;
>
> - tsk = copy_process(NULL, 0, node, &args);
> - if (!IS_ERR(tsk)) {
> - sigfillset(&tsk->blocked);
> - sigdelsetmask(&tsk->blocked, sigmask(SIGKILL));
> - }
> - return tsk;
> + return copy_process(NULL, 0, node, &args);
> }
>
> /*

2021-03-26 22:14:15

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH 1/7] kernel: don't call do_exit() for PF_IO_WORKER threads

On 3/26/21 2:43 PM, Eric W. Biederman wrote:
> Jens Axboe <[email protected]> writes:
>
>> Right now we're never calling get_signal() from PF_IO_WORKER threads, but
>> in preparation for doing so, don't handle a fatal signal for them. The
>> workers have state they need to cleanup when exiting, and they don't do
>> coredumps, so just return instead of performing either a dump or calling
>> do_exit() on their behalf. The threads themselves will detect a fatal
>> signal and do proper shutdown.
>>
>> Signed-off-by: Jens Axboe <[email protected]>
>> ---
>> kernel/signal.c | 9 +++++++++
>> 1 file changed, 9 insertions(+)
>>
>> diff --git a/kernel/signal.c b/kernel/signal.c
>> index f2a1b898da29..e3e1b8fbfe8a 100644
>> --- a/kernel/signal.c
>> +++ b/kernel/signal.c
>> @@ -2756,6 +2756,15 @@ bool get_signal(struct ksignal *ksig)
>> */
>> current->flags |= PF_SIGNALED;
>>
>> + /*
>> + * PF_IO_WORKER threads will catch and exit on fatal signals
>> + * themselves. They have cleanup that must be performed, so
>> + * we cannot call do_exit() on their behalf. coredumps also
>> + * do not apply to them.
>> + */
>> + if (current->flags & PF_IO_WORKER)
>> + return false;
>> +
>
> Returning false when get_signal needs the caller to handle a signal
> adds a very weird and awkward special case to how get_signal returns
> arguments.
>
> Instead you should simply break and let get_signal return SIGKILL like
> any other signal that has a handler that the caller of get_signal needs
> to handle.
>
> Something like:
>> + /*
>> + * PF_IO_WORKER have cleanup that must be performed,
>> + * before calling do_exit().
>> + */
>> + if (current->flags & PF_IO_WORKER)
>> + break;
>
>
> As do_coredump does not call do_exit there is no reason to skip calling into
> the coredump handling either. And allowing it will remove yet another
> special case from the io worker code.

Thanks, I'll turn it into a break, that does seem like a better idea in
general. Actually it wants to be a goto or similar, as a break will
assume that we have the sighand lock held. With the coredump being
irrelevant, I'll just it before the do_exit() call.

--
Jens Axboe

2021-03-26 22:17:57

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH 2/7] io_uring: handle signals for IO threads like a normal thread

On 3/26/21 2:29 PM, Eric W. Biederman wrote:
> Jens Axboe <[email protected]> writes:
>
>> We go through various hoops to disallow signals for the IO threads, but
>> there's really no reason why we cannot just allow them. The IO threads
>> never return to userspace like a normal thread, and hence don't go through
>> normal signal processing. Instead, just check for a pending signal as part
>> of the work loop, and call get_signal() to handle it for us if anything
>> is pending.
>>
>> With that, we can support receiving signals, including special ones like
>> SIGSTOP.
>>
>> Signed-off-by: Jens Axboe <[email protected]>
>> ---
>> fs/io-wq.c | 24 +++++++++++++++++-------
>> fs/io_uring.c | 12 ++++++++----
>> 2 files changed, 25 insertions(+), 11 deletions(-)
>>
>> diff --git a/fs/io-wq.c b/fs/io-wq.c
>> index b7c1fa932cb3..3e2f059a1737 100644
>> --- a/fs/io-wq.c
>> +++ b/fs/io-wq.c
>> @@ -16,7 +16,6 @@
>> #include <linux/rculist_nulls.h>
>> #include <linux/cpu.h>
>> #include <linux/tracehook.h>
>> -#include <linux/freezer.h>
>>
>> #include "../kernel/sched/sched.h"
>> #include "io-wq.h"
>> @@ -503,10 +502,16 @@ static int io_wqe_worker(void *data)
>> if (io_flush_signals())
>> continue;
>> ret = schedule_timeout(WORKER_IDLE_TIMEOUT);
>> - if (try_to_freeze() || ret)
>> + if (signal_pending(current)) {
>> + struct ksignal ksig;
>> +
>> + if (fatal_signal_pending(current))
>> + break;
>> + if (get_signal(&ksig))
>> + continue;
> ^^^^^^^^^^^^^^^^^^^^^^
>
> That is wrong. You are promising to deliver a signal to signal
> handler and them simply discarding it. Perhaps:
>
> if (!get_signal(&ksig))
> continue;
> WARN_ON(!sig_kernel_stop(ksig->sig));
> break;

Thanks, updated.

--
Jens Axboe

2021-03-26 22:26:27

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH 2/7] io_uring: handle signals for IO threads like a normal thread

Jens Axboe <[email protected]> writes:

> On 3/26/21 2:29 PM, Eric W. Biederman wrote:
>> Jens Axboe <[email protected]> writes:
>>
>>> We go through various hoops to disallow signals for the IO threads, but
>>> there's really no reason why we cannot just allow them. The IO threads
>>> never return to userspace like a normal thread, and hence don't go through
>>> normal signal processing. Instead, just check for a pending signal as part
>>> of the work loop, and call get_signal() to handle it for us if anything
>>> is pending.
>>>
>>> With that, we can support receiving signals, including special ones like
>>> SIGSTOP.
>>>
>>> Signed-off-by: Jens Axboe <[email protected]>
>>> ---
>>> fs/io-wq.c | 24 +++++++++++++++++-------
>>> fs/io_uring.c | 12 ++++++++----
>>> 2 files changed, 25 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/fs/io-wq.c b/fs/io-wq.c
>>> index b7c1fa932cb3..3e2f059a1737 100644
>>> --- a/fs/io-wq.c
>>> +++ b/fs/io-wq.c
>>> @@ -16,7 +16,6 @@
>>> #include <linux/rculist_nulls.h>
>>> #include <linux/cpu.h>
>>> #include <linux/tracehook.h>
>>> -#include <linux/freezer.h>
>>>
>>> #include "../kernel/sched/sched.h"
>>> #include "io-wq.h"
>>> @@ -503,10 +502,16 @@ static int io_wqe_worker(void *data)
>>> if (io_flush_signals())
>>> continue;
>>> ret = schedule_timeout(WORKER_IDLE_TIMEOUT);
>>> - if (try_to_freeze() || ret)
>>> + if (signal_pending(current)) {
>>> + struct ksignal ksig;
>>> +
>>> + if (fatal_signal_pending(current))
>>> + break;
>>> + if (get_signal(&ksig))
>>> + continue;
>> ^^^^^^^^^^^^^^^^^^^^^^
>>
>> That is wrong. You are promising to deliver a signal to signal
>> handler and them simply discarding it. Perhaps:
>>
>> if (!get_signal(&ksig))
>> continue;
>> WARN_ON(!sig_kernel_stop(ksig->sig));
>> break;
>
> Thanks, updated.

Gah. Kill the WARN_ON.

I was thinking "WARN_ON(!sig_kernel_fatal(ksig->sig));"
The function sig_kernel_fatal does not exist.

Fatal is the state that is left when a signal is neither
ignored nor a stop signal, and does not have a handler.

The rest of the logic still works.

Eric

2021-03-26 22:34:49

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH 2/7] io_uring: handle signals for IO threads like a normal thread

On 3/26/21 4:23 PM, Eric W. Biederman wrote:
> Jens Axboe <[email protected]> writes:
>
>> On 3/26/21 2:29 PM, Eric W. Biederman wrote:
>>> Jens Axboe <[email protected]> writes:
>>>
>>>> We go through various hoops to disallow signals for the IO threads, but
>>>> there's really no reason why we cannot just allow them. The IO threads
>>>> never return to userspace like a normal thread, and hence don't go through
>>>> normal signal processing. Instead, just check for a pending signal as part
>>>> of the work loop, and call get_signal() to handle it for us if anything
>>>> is pending.
>>>>
>>>> With that, we can support receiving signals, including special ones like
>>>> SIGSTOP.
>>>>
>>>> Signed-off-by: Jens Axboe <[email protected]>
>>>> ---
>>>> fs/io-wq.c | 24 +++++++++++++++++-------
>>>> fs/io_uring.c | 12 ++++++++----
>>>> 2 files changed, 25 insertions(+), 11 deletions(-)
>>>>
>>>> diff --git a/fs/io-wq.c b/fs/io-wq.c
>>>> index b7c1fa932cb3..3e2f059a1737 100644
>>>> --- a/fs/io-wq.c
>>>> +++ b/fs/io-wq.c
>>>> @@ -16,7 +16,6 @@
>>>> #include <linux/rculist_nulls.h>
>>>> #include <linux/cpu.h>
>>>> #include <linux/tracehook.h>
>>>> -#include <linux/freezer.h>
>>>>
>>>> #include "../kernel/sched/sched.h"
>>>> #include "io-wq.h"
>>>> @@ -503,10 +502,16 @@ static int io_wqe_worker(void *data)
>>>> if (io_flush_signals())
>>>> continue;
>>>> ret = schedule_timeout(WORKER_IDLE_TIMEOUT);
>>>> - if (try_to_freeze() || ret)
>>>> + if (signal_pending(current)) {
>>>> + struct ksignal ksig;
>>>> +
>>>> + if (fatal_signal_pending(current))
>>>> + break;
>>>> + if (get_signal(&ksig))
>>>> + continue;
>>> ^^^^^^^^^^^^^^^^^^^^^^
>>>
>>> That is wrong. You are promising to deliver a signal to signal
>>> handler and them simply discarding it. Perhaps:
>>>
>>> if (!get_signal(&ksig))
>>> continue;
>>> WARN_ON(!sig_kernel_stop(ksig->sig));
>>> break;
>>
>> Thanks, updated.
>
> Gah. Kill the WARN_ON.
>
> I was thinking "WARN_ON(!sig_kernel_fatal(ksig->sig));"
> The function sig_kernel_fatal does not exist.
>
> Fatal is the state that is left when a signal is neither
> ignored nor a stop signal, and does not have a handler.
>
> The rest of the logic still works.

I've just come to the same conclusion myself after testing it.
Of the 3 cases, most of them can do the continue, but doesn't
really matter with the way the loop is structured. Anyway, looks
like this now:


commit 769186e30cd437f5e1a000e7cf00286948779da4
Author: Jens Axboe <[email protected]>
Date: Thu Mar 25 18:16:06 2021 -0600

io_uring: handle signals for IO threads like a normal thread

We go through various hoops to disallow signals for the IO threads, but
there's really no reason why we cannot just allow them. The IO threads
never return to userspace like a normal thread, and hence don't go through
normal signal processing. Instead, just check for a pending signal as part
of the work loop, and call get_signal() to handle it for us if anything
is pending.

With that, we can support receiving signals, including special ones like
SIGSTOP.

Signed-off-by: Jens Axboe <[email protected]>

diff --git a/fs/io-wq.c b/fs/io-wq.c
index b7c1fa932cb3..7e5970c8b0be 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -16,7 +16,6 @@
#include <linux/rculist_nulls.h>
#include <linux/cpu.h>
#include <linux/tracehook.h>
-#include <linux/freezer.h>

#include "../kernel/sched/sched.h"
#include "io-wq.h"
@@ -503,10 +502,16 @@ static int io_wqe_worker(void *data)
if (io_flush_signals())
continue;
ret = schedule_timeout(WORKER_IDLE_TIMEOUT);
- if (try_to_freeze() || ret)
+ if (signal_pending(current)) {
+ struct ksignal ksig;
+
+ if (fatal_signal_pending(current))
+ break;
+ if (!get_signal(&ksig))
+ continue;
+ }
+ if (ret)
continue;
- if (fatal_signal_pending(current))
- break;
/* timed out, exit unless we're the fixed worker */
if (test_bit(IO_WQ_BIT_EXIT, &wq->state) ||
!(worker->flags & IO_WORKER_F_FIXED))
@@ -714,9 +719,15 @@ static int io_wq_manager(void *data)
set_current_state(TASK_INTERRUPTIBLE);
io_wq_check_workers(wq);
schedule_timeout(HZ);
- try_to_freeze();
- if (fatal_signal_pending(current))
- set_bit(IO_WQ_BIT_EXIT, &wq->state);
+ if (signal_pending(current)) {
+ struct ksignal ksig;
+
+ if (fatal_signal_pending(current)) {
+ set_bit(IO_WQ_BIT_EXIT, &wq->state);
+ continue;
+ }
+ get_signal(&ksig);
+ }
} while (!test_bit(IO_WQ_BIT_EXIT, &wq->state));

io_wq_check_workers(wq);
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 54ea561db4a5..1c64e3f9b7a2 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -78,7 +78,6 @@
#include <linux/task_work.h>
#include <linux/pagemap.h>
#include <linux/io_uring.h>
-#include <linux/freezer.h>

#define CREATE_TRACE_POINTS
#include <trace/events/io_uring.h>
@@ -6765,8 +6764,14 @@ static int io_sq_thread(void *data)
timeout = jiffies + sqd->sq_thread_idle;
continue;
}
- if (fatal_signal_pending(current))
- break;
+ if (signal_pending(current)) {
+ struct ksignal ksig;
+
+ if (fatal_signal_pending(current))
+ break;
+ if (!get_signal(&ksig))
+ continue;
+ }
sqt_spin = false;
cap_entries = !list_is_singular(&sqd->ctx_list);
list_for_each_entry(ctx, &sqd->ctx_list, sqd_list) {
@@ -6809,7 +6814,6 @@ static int io_sq_thread(void *data)

mutex_unlock(&sqd->lock);
schedule();
- try_to_freeze();
mutex_lock(&sqd->lock);
list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
io_ring_clear_wakeup_flag(ctx);

--
Jens Axboe

2021-03-26 22:38:38

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH 2/7] io_uring: handle signals for IO threads like a normal thread

Jens Axboe <[email protected]> writes:

> On 3/26/21 4:23 PM, Eric W. Biederman wrote:
>> Jens Axboe <[email protected]> writes:
>>
>>> On 3/26/21 2:29 PM, Eric W. Biederman wrote:
>>>> Jens Axboe <[email protected]> writes:
>>>>
>>>>> We go through various hoops to disallow signals for the IO threads, but
>>>>> there's really no reason why we cannot just allow them. The IO threads
>>>>> never return to userspace like a normal thread, and hence don't go through
>>>>> normal signal processing. Instead, just check for a pending signal as part
>>>>> of the work loop, and call get_signal() to handle it for us if anything
>>>>> is pending.
>>>>>
>>>>> With that, we can support receiving signals, including special ones like
>>>>> SIGSTOP.
>>>>>
>>>>> Signed-off-by: Jens Axboe <[email protected]>
>>>>> ---
>>>>> fs/io-wq.c | 24 +++++++++++++++++-------
>>>>> fs/io_uring.c | 12 ++++++++----
>>>>> 2 files changed, 25 insertions(+), 11 deletions(-)
>>>>>
>>>>> diff --git a/fs/io-wq.c b/fs/io-wq.c
>>>>> index b7c1fa932cb3..3e2f059a1737 100644
>>>>> --- a/fs/io-wq.c
>>>>> +++ b/fs/io-wq.c
>>>>> @@ -16,7 +16,6 @@
>>>>> #include <linux/rculist_nulls.h>
>>>>> #include <linux/cpu.h>
>>>>> #include <linux/tracehook.h>
>>>>> -#include <linux/freezer.h>
>>>>>
>>>>> #include "../kernel/sched/sched.h"
>>>>> #include "io-wq.h"
>>>>> @@ -503,10 +502,16 @@ static int io_wqe_worker(void *data)
>>>>> if (io_flush_signals())
>>>>> continue;
>>>>> ret = schedule_timeout(WORKER_IDLE_TIMEOUT);
>>>>> - if (try_to_freeze() || ret)
>>>>> + if (signal_pending(current)) {
>>>>> + struct ksignal ksig;
>>>>> +
>>>>> + if (fatal_signal_pending(current))
>>>>> + break;
>>>>> + if (get_signal(&ksig))
>>>>> + continue;
>>>> ^^^^^^^^^^^^^^^^^^^^^^
>>>>
>>>> That is wrong. You are promising to deliver a signal to signal
>>>> handler and them simply discarding it. Perhaps:
>>>>
>>>> if (!get_signal(&ksig))
>>>> continue;
>>>> WARN_ON(!sig_kernel_stop(ksig->sig));
>>>> break;
>>>
>>> Thanks, updated.
>>
>> Gah. Kill the WARN_ON.
>>
>> I was thinking "WARN_ON(!sig_kernel_fatal(ksig->sig));"
>> The function sig_kernel_fatal does not exist.
>>
>> Fatal is the state that is left when a signal is neither
>> ignored nor a stop signal, and does not have a handler.
>>
>> The rest of the logic still works.
>
> I've just come to the same conclusion myself after testing it.
> Of the 3 cases, most of them can do the continue, but doesn't
> really matter with the way the loop is structured. Anyway, looks
> like this now:

This idiom in the code:
> + if (signal_pending(current)) {
> + struct ksignal ksig;
> +
> + if (fatal_signal_pending(current))
> + break;
> + if (!get_signal(&ksig))
> + continue;
> }

Needs to be:
> + if (signal_pending(current)) {
> + struct ksignal ksig;
> +
> + if (!get_signal(&ksig))
> + continue;
> + break;
> }

Because any signal returned from get_signal is fatal in this case.
It might make sense to "WARN_ON(ksig->ka.sa.sa_handler != SIG_DFL)".
As the io workers don't handle that case.

It won't happen because you have everything blocked.

The extra fatal_signal_pending(current) logic is just confusing in this
case.

Eric

2021-03-26 22:41:47

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH 2/7] io_uring: handle signals for IO threads like a normal thread

On 3/26/21 4:35 PM, Eric W. Biederman wrote:
> Jens Axboe <[email protected]> writes:
>
>> On 3/26/21 4:23 PM, Eric W. Biederman wrote:
>>> Jens Axboe <[email protected]> writes:
>>>
>>>> On 3/26/21 2:29 PM, Eric W. Biederman wrote:
>>>>> Jens Axboe <[email protected]> writes:
>>>>>
>>>>>> We go through various hoops to disallow signals for the IO threads, but
>>>>>> there's really no reason why we cannot just allow them. The IO threads
>>>>>> never return to userspace like a normal thread, and hence don't go through
>>>>>> normal signal processing. Instead, just check for a pending signal as part
>>>>>> of the work loop, and call get_signal() to handle it for us if anything
>>>>>> is pending.
>>>>>>
>>>>>> With that, we can support receiving signals, including special ones like
>>>>>> SIGSTOP.
>>>>>>
>>>>>> Signed-off-by: Jens Axboe <[email protected]>
>>>>>> ---
>>>>>> fs/io-wq.c | 24 +++++++++++++++++-------
>>>>>> fs/io_uring.c | 12 ++++++++----
>>>>>> 2 files changed, 25 insertions(+), 11 deletions(-)
>>>>>>
>>>>>> diff --git a/fs/io-wq.c b/fs/io-wq.c
>>>>>> index b7c1fa932cb3..3e2f059a1737 100644
>>>>>> --- a/fs/io-wq.c
>>>>>> +++ b/fs/io-wq.c
>>>>>> @@ -16,7 +16,6 @@
>>>>>> #include <linux/rculist_nulls.h>
>>>>>> #include <linux/cpu.h>
>>>>>> #include <linux/tracehook.h>
>>>>>> -#include <linux/freezer.h>
>>>>>>
>>>>>> #include "../kernel/sched/sched.h"
>>>>>> #include "io-wq.h"
>>>>>> @@ -503,10 +502,16 @@ static int io_wqe_worker(void *data)
>>>>>> if (io_flush_signals())
>>>>>> continue;
>>>>>> ret = schedule_timeout(WORKER_IDLE_TIMEOUT);
>>>>>> - if (try_to_freeze() || ret)
>>>>>> + if (signal_pending(current)) {
>>>>>> + struct ksignal ksig;
>>>>>> +
>>>>>> + if (fatal_signal_pending(current))
>>>>>> + break;
>>>>>> + if (get_signal(&ksig))
>>>>>> + continue;
>>>>> ^^^^^^^^^^^^^^^^^^^^^^
>>>>>
>>>>> That is wrong. You are promising to deliver a signal to signal
>>>>> handler and them simply discarding it. Perhaps:
>>>>>
>>>>> if (!get_signal(&ksig))
>>>>> continue;
>>>>> WARN_ON(!sig_kernel_stop(ksig->sig));
>>>>> break;
>>>>
>>>> Thanks, updated.
>>>
>>> Gah. Kill the WARN_ON.
>>>
>>> I was thinking "WARN_ON(!sig_kernel_fatal(ksig->sig));"
>>> The function sig_kernel_fatal does not exist.
>>>
>>> Fatal is the state that is left when a signal is neither
>>> ignored nor a stop signal, and does not have a handler.
>>>
>>> The rest of the logic still works.
>>
>> I've just come to the same conclusion myself after testing it.
>> Of the 3 cases, most of them can do the continue, but doesn't
>> really matter with the way the loop is structured. Anyway, looks
>> like this now:
>
> This idiom in the code:
>> + if (signal_pending(current)) {
>> + struct ksignal ksig;
>> +
>> + if (fatal_signal_pending(current))
>> + break;
>> + if (!get_signal(&ksig))
>> + continue;
>> }
>
> Needs to be:
>> + if (signal_pending(current)) {
>> + struct ksignal ksig;
>> +
>> + if (!get_signal(&ksig))
>> + continue;
>> + break;
>> }
>
> Because any signal returned from get_signal is fatal in this case.
> It might make sense to "WARN_ON(ksig->ka.sa.sa_handler != SIG_DFL)".
> As the io workers don't handle that case.
>
> It won't happen because you have everything blocked.
>
> The extra fatal_signal_pending(current) logic is just confusing in this
> case.

OK good point, and follows the same logic even if it won't make a
difference in my case. I'll make the change.

--
Jens Axboe

2021-03-26 22:51:43

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH 2/7] io_uring: handle signals for IO threads like a normal thread

On 3/26/21 4:38 PM, Jens Axboe wrote:
> On 3/26/21 4:35 PM, Eric W. Biederman wrote:
>> Jens Axboe <[email protected]> writes:
>>
>>> On 3/26/21 4:23 PM, Eric W. Biederman wrote:
>>>> Jens Axboe <[email protected]> writes:
>>>>
>>>>> On 3/26/21 2:29 PM, Eric W. Biederman wrote:
>>>>>> Jens Axboe <[email protected]> writes:
>>>>>>
>>>>>>> We go through various hoops to disallow signals for the IO threads, but
>>>>>>> there's really no reason why we cannot just allow them. The IO threads
>>>>>>> never return to userspace like a normal thread, and hence don't go through
>>>>>>> normal signal processing. Instead, just check for a pending signal as part
>>>>>>> of the work loop, and call get_signal() to handle it for us if anything
>>>>>>> is pending.
>>>>>>>
>>>>>>> With that, we can support receiving signals, including special ones like
>>>>>>> SIGSTOP.
>>>>>>>
>>>>>>> Signed-off-by: Jens Axboe <[email protected]>
>>>>>>> ---
>>>>>>> fs/io-wq.c | 24 +++++++++++++++++-------
>>>>>>> fs/io_uring.c | 12 ++++++++----
>>>>>>> 2 files changed, 25 insertions(+), 11 deletions(-)
>>>>>>>
>>>>>>> diff --git a/fs/io-wq.c b/fs/io-wq.c
>>>>>>> index b7c1fa932cb3..3e2f059a1737 100644
>>>>>>> --- a/fs/io-wq.c
>>>>>>> +++ b/fs/io-wq.c
>>>>>>> @@ -16,7 +16,6 @@
>>>>>>> #include <linux/rculist_nulls.h>
>>>>>>> #include <linux/cpu.h>
>>>>>>> #include <linux/tracehook.h>
>>>>>>> -#include <linux/freezer.h>
>>>>>>>
>>>>>>> #include "../kernel/sched/sched.h"
>>>>>>> #include "io-wq.h"
>>>>>>> @@ -503,10 +502,16 @@ static int io_wqe_worker(void *data)
>>>>>>> if (io_flush_signals())
>>>>>>> continue;
>>>>>>> ret = schedule_timeout(WORKER_IDLE_TIMEOUT);
>>>>>>> - if (try_to_freeze() || ret)
>>>>>>> + if (signal_pending(current)) {
>>>>>>> + struct ksignal ksig;
>>>>>>> +
>>>>>>> + if (fatal_signal_pending(current))
>>>>>>> + break;
>>>>>>> + if (get_signal(&ksig))
>>>>>>> + continue;
>>>>>> ^^^^^^^^^^^^^^^^^^^^^^
>>>>>>
>>>>>> That is wrong. You are promising to deliver a signal to signal
>>>>>> handler and them simply discarding it. Perhaps:
>>>>>>
>>>>>> if (!get_signal(&ksig))
>>>>>> continue;
>>>>>> WARN_ON(!sig_kernel_stop(ksig->sig));
>>>>>> break;
>>>>>
>>>>> Thanks, updated.
>>>>
>>>> Gah. Kill the WARN_ON.
>>>>
>>>> I was thinking "WARN_ON(!sig_kernel_fatal(ksig->sig));"
>>>> The function sig_kernel_fatal does not exist.
>>>>
>>>> Fatal is the state that is left when a signal is neither
>>>> ignored nor a stop signal, and does not have a handler.
>>>>
>>>> The rest of the logic still works.
>>>
>>> I've just come to the same conclusion myself after testing it.
>>> Of the 3 cases, most of them can do the continue, but doesn't
>>> really matter with the way the loop is structured. Anyway, looks
>>> like this now:
>>
>> This idiom in the code:
>>> + if (signal_pending(current)) {
>>> + struct ksignal ksig;
>>> +
>>> + if (fatal_signal_pending(current))
>>> + break;
>>> + if (!get_signal(&ksig))
>>> + continue;
>>> }
>>
>> Needs to be:
>>> + if (signal_pending(current)) {
>>> + struct ksignal ksig;
>>> +
>>> + if (!get_signal(&ksig))
>>> + continue;
>>> + break;
>>> }
>>
>> Because any signal returned from get_signal is fatal in this case.
>> It might make sense to "WARN_ON(ksig->ka.sa.sa_handler != SIG_DFL)".
>> As the io workers don't handle that case.
>>
>> It won't happen because you have everything blocked.
>>
>> The extra fatal_signal_pending(current) logic is just confusing in this
>> case.
>
> OK good point, and follows the same logic even if it won't make a
> difference in my case. I'll make the change.

Made the suggested edits and ran the quick tests and the KILL/STOP
testing, and no ill effects observed. Kicked off the longer runs now.

Not a huge amount of changes from the posted series, but please peruse
here if you want to double check:

https://git.kernel.dk/cgit/linux-block/log/?h=io_uring-5.12

And diff against v2 posted is below. Thanks!

diff --git a/fs/io-wq.c b/fs/io-wq.c
index 3e2f059a1737..7434eb40ca8c 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -505,10 +505,9 @@ static int io_wqe_worker(void *data)
if (signal_pending(current)) {
struct ksignal ksig;

- if (fatal_signal_pending(current))
- break;
- if (get_signal(&ksig))
+ if (!get_signal(&ksig))
continue;
+ break;
}
if (ret)
continue;
@@ -722,10 +721,9 @@ static int io_wq_manager(void *data)
if (signal_pending(current)) {
struct ksignal ksig;

- if (fatal_signal_pending(current))
- set_bit(IO_WQ_BIT_EXIT, &wq->state);
- else if (get_signal(&ksig))
+ if (!get_signal(&ksig))
continue;
+ set_bit(IO_WQ_BIT_EXIT, &wq->state);
}
} while (!test_bit(IO_WQ_BIT_EXIT, &wq->state));

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 66ae46874d85..880abd8b6d31 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -6746,10 +6746,9 @@ static int io_sq_thread(void *data)
if (signal_pending(current)) {
struct ksignal ksig;

- if (fatal_signal_pending(current))
- break;
- if (get_signal(&ksig))
+ if (!get_signal(&ksig))
continue;
+ break;
}
sqt_spin = false;
cap_entries = !list_is_singular(&sqd->ctx_list);
diff --git a/kernel/signal.c b/kernel/signal.c
index 5b75fbe3d2d6..f2718350bf4b 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2752,15 +2752,6 @@ bool get_signal(struct ksignal *ksig)
*/
current->flags |= PF_SIGNALED;

- /*
- * PF_IO_WORKER threads will catch and exit on fatal signals
- * themselves. They have cleanup that must be performed, so
- * we cannot call do_exit() on their behalf. coredumps also
- * do not apply to them.
- */
- if (current->flags & PF_IO_WORKER)
- return false;
-
if (sig_kernel_coredump(signr)) {
if (print_fatal_signals)
print_fatal_signal(ksig->info.si_signo);
@@ -2776,6 +2767,14 @@ bool get_signal(struct ksignal *ksig)
do_coredump(&ksig->info);
}

+ /*
+ * PF_IO_WORKER threads will catch and exit on fatal signals
+ * themselves. They have cleanup that must be performed, so
+ * we cannot call do_exit() on their behalf.
+ */
+ if (current->flags & PF_IO_WORKER)
+ goto out;
+
/*
* Death signals, no core dump.
*/
@@ -2783,7 +2782,7 @@ bool get_signal(struct ksignal *ksig)
/* NOTREACHED */
}
spin_unlock_irq(&sighand->siglock);
-
+out:
ksig->sig = signr;

if (!(ksig->ka.sa.sa_flags & SA_EXPOSE_TAGBITS))

--
Jens Axboe

2021-03-27 17:44:15

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH 2/7] io_uring: handle signals for IO threads like a normal thread

Jens Axboe <[email protected]> writes:

> On 3/26/21 4:38 PM, Jens Axboe wrote:
>> OK good point, and follows the same logic even if it won't make a
>> difference in my case. I'll make the change.
>
> Made the suggested edits and ran the quick tests and the KILL/STOP
> testing, and no ill effects observed. Kicked off the longer runs now.
>
> Not a huge amount of changes from the posted series, but please peruse
> here if you want to double check:
>
> https://git.kernel.dk/cgit/linux-block/log/?h=io_uring-5.12
>
> And diff against v2 posted is below. Thanks!

That looks good. Thanks.

Acked-by: "Eric W. Biederman" <[email protected]>

>
> diff --git a/fs/io-wq.c b/fs/io-wq.c
> index 3e2f059a1737..7434eb40ca8c 100644
> --- a/fs/io-wq.c
> +++ b/fs/io-wq.c
> @@ -505,10 +505,9 @@ static int io_wqe_worker(void *data)
> if (signal_pending(current)) {
> struct ksignal ksig;
>
> - if (fatal_signal_pending(current))
> - break;
> - if (get_signal(&ksig))
> + if (!get_signal(&ksig))
> continue;
> + break;
> }
> if (ret)
> continue;
> @@ -722,10 +721,9 @@ static int io_wq_manager(void *data)
> if (signal_pending(current)) {
> struct ksignal ksig;
>
> - if (fatal_signal_pending(current))
> - set_bit(IO_WQ_BIT_EXIT, &wq->state);
> - else if (get_signal(&ksig))
> + if (!get_signal(&ksig))
> continue;
> + set_bit(IO_WQ_BIT_EXIT, &wq->state);
> }
> } while (!test_bit(IO_WQ_BIT_EXIT, &wq->state));
>
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 66ae46874d85..880abd8b6d31 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -6746,10 +6746,9 @@ static int io_sq_thread(void *data)
> if (signal_pending(current)) {
> struct ksignal ksig;
>
> - if (fatal_signal_pending(current))
> - break;
> - if (get_signal(&ksig))
> + if (!get_signal(&ksig))
> continue;
> + break;
> }
> sqt_spin = false;
> cap_entries = !list_is_singular(&sqd->ctx_list);
> diff --git a/kernel/signal.c b/kernel/signal.c
> index 5b75fbe3d2d6..f2718350bf4b 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -2752,15 +2752,6 @@ bool get_signal(struct ksignal *ksig)
> */
> current->flags |= PF_SIGNALED;
>
> - /*
> - * PF_IO_WORKER threads will catch and exit on fatal signals
> - * themselves. They have cleanup that must be performed, so
> - * we cannot call do_exit() on their behalf. coredumps also
> - * do not apply to them.
> - */
> - if (current->flags & PF_IO_WORKER)
> - return false;
> -
> if (sig_kernel_coredump(signr)) {
> if (print_fatal_signals)
> print_fatal_signal(ksig->info.si_signo);
> @@ -2776,6 +2767,14 @@ bool get_signal(struct ksignal *ksig)
> do_coredump(&ksig->info);
> }
>
> + /*
> + * PF_IO_WORKER threads will catch and exit on fatal signals
> + * themselves. They have cleanup that must be performed, so
> + * we cannot call do_exit() on their behalf.
> + */
> + if (current->flags & PF_IO_WORKER)
> + goto out;
> +
> /*
> * Death signals, no core dump.
> */
> @@ -2783,7 +2782,7 @@ bool get_signal(struct ksignal *ksig)
> /* NOTREACHED */
> }
> spin_unlock_irq(&sighand->siglock);
> -
> +out:
> ksig->sig = signr;
>
> if (!(ksig->ka.sa.sa_flags & SA_EXPOSE_TAGBITS))

2021-03-27 20:10:53

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH 2/7] io_uring: handle signals for IO threads like a normal thread

On 3/27/21 11:40 AM, Eric W. Biederman wrote:
> Jens Axboe <[email protected]> writes:
>
>> On 3/26/21 4:38 PM, Jens Axboe wrote:
>>> OK good point, and follows the same logic even if it won't make a
>>> difference in my case. I'll make the change.
>>
>> Made the suggested edits and ran the quick tests and the KILL/STOP
>> testing, and no ill effects observed. Kicked off the longer runs now.
>>
>> Not a huge amount of changes from the posted series, but please peruse
>> here if you want to double check:
>>
>> https://git.kernel.dk/cgit/linux-block/log/?h=io_uring-5.12
>>
>> And diff against v2 posted is below. Thanks!
>
> That looks good. Thanks.
>
> Acked-by: "Eric W. Biederman" <[email protected]>

Thanks Eric, amended to add that.

--
Jens Axboe