2021-03-10 02:04:20

by Kevin Locke

Subject: [v5.12-rc2 regression] io_uring: high CPU use after suspend-to-ram

With kernel 5.12-rc2 (and torvalds/master 144c79ef3353), if mpd is
playing or paused when my system is suspended-to-ram, when the system is
resumed mpd will consume ~200% CPU until killed. It continues to
produce audio and respond to pause/play commands, which do not affect
CPU usage. This occurs with either pulse (to PulseAudio or
PipeWire-as-PulseAudio) or alsa audio_output.

The issue appears to have been introduced by a combination of two
commits: 3bfe6106693b caused freeze on suspend-to-ram when mpd is paused
or playing. e4b4a13f4941 fixed suspend-to-ram, but introduced the high
CPU on resume.

I attempted to further diagnose using `perf record -p $(pidof mpd)`.
Running for about a minute after resume shows ~280 MMAP2 events and
almost nothing else. I'm not sure what to make of that or how to
further investigate.

Let me know if there's anything else I can do to help diagnose/test.

Thanks,
Kevin
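
Where `perf record -p` shows little, one way to narrow things down might be to compare per-thread CPU time for the process before and after resume. The sketch below is a minimal example that assumes the Linux procfs layout described in proc(5); the function names `thread_cpu_ticks` and `print_thread_cpu` are hypothetical helpers, not part of any existing tool.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Reads comm, utime and stime (in clock ticks) for one thread from
 * /proc/<pid>/task/<tid>/stat. Returns 0 on success, -1 on failure.
 */
int thread_cpu_ticks(pid_t pid, pid_t tid, char *comm, size_t comm_len,
		     unsigned long *utime, unsigned long *stime)
{
	char path[64], buf[1024], state;
	char *lp, *rp;
	size_t n;
	FILE *f;

	assert(comm_len > 0);
	snprintf(path, sizeof(path), "/proc/%d/task/%d/stat", (int)pid, (int)tid);
	f = fopen(path, "r");
	if (!f)
		return -1;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';

	/* comm is wrapped in parentheses and may itself contain ')' */
	lp = strchr(buf, '(');
	rp = strrchr(buf, ')');
	if (!lp || !rp || rp < lp)
		return -1;
	n = (size_t)(rp - lp - 1);
	if (n >= comm_len)
		n = comm_len - 1;
	memcpy(comm, lp + 1, n);
	comm[n] = '\0';

	/* After comm: state, ten fields that are skipped, then utime and stime */
	if (sscanf(rp + 1, " %c %*d %*d %*d %*d %*d %*u %*lu %*lu %*lu %*lu %lu %lu",
		   &state, utime, stime) != 3)
		return -1;
	return 0;
}

/*
 * Prints one line per thread of pid (pid <= 0 means the calling process).
 * Returns the number of threads listed, or -1 if the task directory
 * cannot be opened.
 */
int print_thread_cpu(pid_t pid)
{
	char path[64], comm[64];
	unsigned long ut, st;
	struct dirent *de;
	int count = 0;
	DIR *d;

	if (pid <= 0)
		pid = getpid();
	snprintf(path, sizeof(path), "/proc/%d/task", (int)pid);
	d = opendir(path);
	if (!d)
		return -1;
	while ((de = readdir(d)) != NULL) {
		pid_t tid = (pid_t)atoi(de->d_name);

		if (tid <= 0)	/* skips "." and ".." */
			continue;
		if (thread_cpu_ticks(pid, tid, comm, sizeof(comm), &ut, &st) == 0) {
			printf("%6d %-16s utime=%lu stime=%lu\n", (int)tid, comm, ut, st);
			count++;
		}
	}
	closedir(d);
	return count;
}
```

Calling `print_thread_cpu()` with mpd's PID twice, a few seconds apart, should show which thread's tick counts are growing. A thread whose stime climbs quickly while utime stays flat would point at time spent in the kernel rather than in mpd's own code.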


2021-03-10 02:21:58

by Jens Axboe

Subject: Re: [v5.12-rc2 regression] io_uring: high CPU use after suspend-to-ram

On 3/9/21 6:55 PM, Kevin Locke wrote:
> With kernel 5.12-rc2 (and torvalds/master 144c79ef3353), if mpd is
> playing or paused when my system is suspended-to-ram, when the system is
> resumed mpd will consume ~200% CPU until killed. It continues to
> produce audio and respond to pause/play commands, which do not affect
> CPU usage. This occurs with either pulse (to PulseAudio or
> PipeWire-as-PulseAudio) or alsa audio_output.
>
> The issue appears to have been introduced by a combination of two
> commits: 3bfe6106693b caused freeze on suspend-to-ram when mpd is paused
> or playing. e4b4a13f4941 fixed suspend-to-ram, but introduced the high
> CPU on resume.
>
> I attempted to further diagnose using `perf record -p $(pidof mpd)`.
> Running for about a minute after resume shows ~280 MMAP2 events and
> almost nothing else. I'm not sure what to make of that or how to
> further investigate.
>
> Let me know if there's anything else I can do to help diagnose/test.

Thanks for the report, let me take a look and try and reproduce (and
fix) it. I'll let you know if I fail in reproducing and need your
help in testing a fix!

--
Jens Axboe

2021-03-10 02:51:03

by Jens Axboe

Subject: Re: [v5.12-rc2 regression] io_uring: high CPU use after suspend-to-ram

On 3/9/21 6:55 PM, Kevin Locke wrote:
> With kernel 5.12-rc2 (and torvalds/master 144c79ef3353), if mpd is
> playing or paused when my system is suspended-to-ram, when the system is
> resumed mpd will consume ~200% CPU until killed. It continues to
> produce audio and respond to pause/play commands, which do not affect
> CPU usage. This occurs with either pulse (to PulseAudio or
> PipeWire-as-PulseAudio) or alsa audio_output.
>
> The issue appears to have been introduced by a combination of two
> commits: 3bfe6106693b caused freeze on suspend-to-ram when mpd is paused
> or playing. e4b4a13f4941 fixed suspend-to-ram, but introduced the high
> CPU on resume.
>
> I attempted to further diagnose using `perf record -p $(pidof mpd)`.
> Running for about a minute after resume shows ~280 MMAP2 events and
> almost nothing else. I'm not sure what to make of that or how to
> further investigate.
>
> Let me know if there's anything else I can do to help diagnose/test.

The below makes it work as expected for me - but I don't quite
understand why we're continually running after the freeze. Adding Rafael
to help understand this.

Rafael, what appears to happen here from a quick look is that the io
threads are frozen fine and the system suspends. But when we resume,
signal_pending() is perpetually true, and that is why we then see the
io_wq_manager() thread just looping like crazy. Is there anything
special I need to do? Note that these are not kthreads, PF_KTHREAD is
not true. I'm guessing it may have something to do with that, but
haven't dug deeper yet.


diff --git a/fs/io-wq.c b/fs/io-wq.c
index 3d7060ba547a..0ae9ecadf295 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -591,7 +591,7 @@ static bool create_io_worker(struct io_wq *wq, struct io_wqe *wqe, int index)
 	tsk->pf_io_worker = worker;
 	worker->task = tsk;
 	set_cpus_allowed_ptr(tsk, cpumask_of_node(wqe->node));
-	tsk->flags |= PF_NOFREEZE | PF_NO_SETAFFINITY;
+	tsk->flags |= PF_NO_SETAFFINITY;

 	raw_spin_lock_irq(&wqe->lock);
 	hlist_nulls_add_head_rcu(&worker->nulls_node, &wqe->free_list);
@@ -709,7 +709,6 @@ static int io_wq_manager(void *data)
 		set_current_state(TASK_INTERRUPTIBLE);
 		io_wq_check_workers(wq);
 		schedule_timeout(HZ);
-		try_to_freeze();
 		if (fatal_signal_pending(current))
 			set_bit(IO_WQ_BIT_EXIT, &wq->state);
 	} while (!test_bit(IO_WQ_BIT_EXIT, &wq->state));
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 280133f3abc4..8f4128eb4aa2 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -6735,7 +6735,6 @@ static int io_sq_thread(void *data)

 			up_read(&sqd->rw_lock);
 			schedule();
-			try_to_freeze();
 			down_read(&sqd->rw_lock);
 			list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
 				io_ring_clear_wakeup_flag(ctx);
diff --git a/kernel/fork.c b/kernel/fork.c
index d3171e8e88e5..72e444cd0ffe 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2436,6 +2436,7 @@ struct task_struct *create_io_thread(int (*fn)(void *), void *arg, int node)
 	if (!IS_ERR(tsk)) {
 		sigfillset(&tsk->blocked);
 		sigdelsetmask(&tsk->blocked, sigmask(SIGKILL));
+		tsk->flags |= PF_NOFREEZE;
 	}
 	return tsk;
 }

--
Jens Axboe
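
The spinning described in the message above can be illustrated with a small userspace model. This is only a sketch and not kernel code: it assumes that an interruptible timed sleep returns immediately, with the full timeout left, for as long as a signal is pending, which is the behaviour the description of `io_wq_manager()` implies. `struct sim_task`, `sim_schedule_timeout()` and `sim_manager_loop()` are made-up names for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the one bit of task state that matters here. */
struct sim_task {
	bool signal_pending;	/* models signal_pending() being true */
};

/*
 * Models an interruptible timed sleep and returns the ticks left unslept.
 * With a signal pending the sleep ends at once and the whole timeout remains.
 */
static long sim_schedule_timeout(const struct sim_task *t, long timeout)
{
	return t->signal_pending ? timeout : 0;
}

/*
 * Runs a manager-style loop until `budget` ticks have elapsed or
 * `max_iters` iterations have run, and returns the iteration count.
 */
long sim_manager_loop(const struct sim_task *t, long hz, long budget,
		      long max_iters)
{
	long elapsed = 0, iters = 0;

	assert(hz > 0);
	while (elapsed < budget && iters < max_iters) {
		long left = sim_schedule_timeout(t, hz);

		elapsed += hz - left;	/* time actually spent asleep */
		iters++;
	}
	return iters;
}
```

With `hz = 100` and a budget of 1000 ticks, a task without a pending signal goes around the loop 10 times. A task whose pending flag is never cleared makes no progress through the budget and only stops at the iteration cap, which is the userspace analogue of a thread pinned at 100% CPU.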

2021-03-10 03:25:22

by Kevin Locke

Subject: Re: [v5.12-rc2 regression] io_uring: high CPU use after suspend-to-ram

On Tue, 2021-03-09 at 19:48 -0700, Jens Axboe wrote:
> On 3/9/21 6:55 PM, Kevin Locke wrote:
>> With kernel 5.12-rc2 (and torvalds/master 144c79ef3353), if mpd is
>> playing or paused when my system is suspended-to-ram, when the system is
>> resumed mpd will consume ~200% CPU until killed. It continues to
>> produce audio and respond to pause/play commands, which do not affect
>> CPU usage. This occurs with either pulse (to PulseAudio or
>> PipeWire-as-PulseAudio) or alsa audio_output.
>
> The below makes it work as expected for me - but I don't quite
> understand why we're continually running after the freeze. Adding Rafael
> to help understand this.

I can confirm that your patch resolves the high CPU usage after suspend
on my system as well. Many thanks!

Tested-by: Kevin Locke <[email protected]>

Happy to test any future revisions as well.

Thanks again,
Kevin

2021-03-10 14:48:57

by Jens Axboe

Subject: Re: [v5.12-rc2 regression] io_uring: high CPU use after suspend-to-ram

On 3/9/21 8:23 PM, Kevin Locke wrote:
> On Tue, 2021-03-09 at 19:48 -0700, Jens Axboe wrote:
>> On 3/9/21 6:55 PM, Kevin Locke wrote:
>>> With kernel 5.12-rc2 (and torvalds/master 144c79ef3353), if mpd is
>>> playing or paused when my system is suspended-to-ram, when the system is
>>> resumed mpd will consume ~200% CPU until killed. It continues to
>>> produce audio and respond to pause/play commands, which do not affect
>>> CPU usage. This occurs with either pulse (to PulseAudio or
>>> PipeWire-as-PulseAudio) or alsa audio_output.
>>
>> The below makes it work as expected for me - but I don't quite
>> understand why we're continually running after the freeze. Adding Rafael
>> to help understand this.
>
> I can confirm that your patch resolves the high CPU usage after suspend
> on my system as well. Many thanks!
>
> Tested-by: Kevin Locke <[email protected]>
>
> Happy to test any future revisions as well.

Thanks, I'll just hold on to this version for now. It's how it would've
worked before the thread rework anyway. I'd still like to understand why
the thaw leaves them spinning, though :-). But once that is understood,
we can potentially just enable freezing again as a separate patch.
Fixing this one is more important for the time being.

--
Jens Axboe