Hi,
The DRBD tests hit an issue with dm-delay on a 6.8 series kernel,
specifically on Ubuntu 24.04 "Noble". Here is a minimal reproducer:
virter vm run --name dm-delay-test --id 10 --wait-ssh ubuntu-noble
virter vm ssh dm-delay-test
# truncate -s 100M /file
# loop_dev=$(losetup -f --show /file)
# echo "0 $(blockdev --getsz $loop_dev) delay $loop_dev 0 0" | dmsetup create delay-volume
After a few minutes, the following is printed to the kernel log:
[ 246.919123] INFO: task dm-delay-flush-:1256 blocked for more than 122 seconds.
[ 246.922543] Not tainted 6.8.0-31-generic #31-Ubuntu
[ 246.923753] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 246.924932] task:dm-delay-flush- state:D stack:0 pid:1256 tgid:1256 ppid:2 flags:0x00004000
[ 246.924940] Call Trace:
[ 246.924950] <TASK>
[ 246.924980] __schedule+0x27c/0x6b0
[ 246.925002] ? __pfx_flush_worker_fn+0x10/0x10 [dm_delay]
[ 246.925011] schedule+0x33/0x110
[ 246.925016] schedule_preempt_disabled+0x15/0x30
[ 246.925035] kthread+0xb1/0x120
..
This bug appears to have been introduced in Linux v6.7 by commit
70bbeb29fab0 ("dm delay: for short delays, use kthread instead of timers
and wq"). The following patch fixes the issue.
Thanks Christian and Benjamin for the comments on v1!
Changes from v1:
- Use kthread_run() instead of wake_up_process()
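
For context on the v1 to v2 change: kthread_run() creates the thread with
kthread_create() and, if that succeeded, wakes it with wake_up_process(). A
thread that is created but never woken stays in uninterruptible sleep, which
is what the hung task check reports. The sketch below is only a toy userspace
model of that difference, not kernel code. Every toy_* name is hypothetical
and exists only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model, NOT kernel code. All toy_* names are hypothetical. */
enum toy_state {
	TOY_CREATED_ASLEEP,	/* created, waiting for its first wake-up */
	TOY_RUNNABLE,		/* woken at least once */
};

struct toy_task {
	enum toy_state state;
	int (*threadfn)(void *data);
	void *data;
};

/* Models kthread_create(): the task exists but has not been woken. */
static void toy_kthread_create(struct toy_task *t,
			       int (*threadfn)(void *data), void *data)
{
	assert(t != NULL);
	t->state = TOY_CREATED_ASLEEP;
	t->threadfn = threadfn;
	t->data = data;
}

/* Models wake_up_process(): the task becomes runnable. */
static void toy_wake_up_process(struct toy_task *t)
{
	assert(t != NULL);
	t->state = TOY_RUNNABLE;
}

/* Models kthread_run(): create the task, then wake it immediately. */
static void toy_kthread_run(struct toy_task *t,
			    int (*threadfn)(void *data), void *data)
{
	toy_kthread_create(t, threadfn, data);
	toy_wake_up_process(t);
}

/* Models the condition the hung task check reports: never woken. */
static bool toy_would_trip_hung_task_check(const struct toy_task *t)
{
	return t->state == TOY_CREATED_ASLEEP;
}
```

In this model, a task set up with toy_kthread_create() alone trips the check
until something calls toy_wake_up_process() on it, while a task set up with
toy_kthread_run() never does.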
Joel Colledge (1):
dm-delay: fix hung task introduced by kthread mode
drivers/md/dm-delay.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
--
2.34.1
If the worker thread is not woken due to a bio, then it is not woken at
all. This causes the hung task check to trigger. This occurs, for
instance, when no bios are submitted. Also when a delay of 0 is
configured, delay_bio() returns without waking the worker.
Prevent the hung task check from triggering by creating the thread with
kthread_run() instead of using kthread_create() directly.
Fixes: 70bbeb29fab0 ("dm delay: for short delays, use kthread instead of timers and wq")
Signed-off-by: Joel Colledge <[email protected]>
---
drivers/md/dm-delay.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/md/dm-delay.c b/drivers/md/dm-delay.c
index 5eabdb06c649..eac166405b6b 100644
--- a/drivers/md/dm-delay.c
+++ b/drivers/md/dm-delay.c
@@ -267,8 +267,7 @@ static int delay_ctr(struct dm_target *ti, unsigned int argc, char **argv)
* In case of small requested delays, use kthread instead of
* timers and workqueue to achieve better latency.
*/
- dc->worker = kthread_create(&flush_worker_fn, dc,
- "dm-delay-flush-worker");
+ dc->worker = kthread_run(&flush_worker_fn, dc, "dm-delay-flush-worker");
if (IS_ERR(dc->worker)) {
ret = PTR_ERR(dc->worker);
dc->worker = NULL;
--
2.34.1
On Mon, May 06, 2024 at 09:25:23AM +0200, Joel Colledge wrote:
> If the worker thread is not woken due to a bio, then it is not woken at
> all. This causes the hung task check to trigger. This occurs, for
> instance, when no bios are submitted. Also when a delay of 0 is
> configured, delay_bio() returns without waking the worker.
>
> Prevent the hung task check from triggering by creating the thread with
> kthread_run() instead of using kthread_create() directly.
>
> Fixes: 70bbeb29fab0 ("dm delay: for short delays, use kthread instead of timers and wq")
> Signed-off-by: Joel Colledge <[email protected]>
Reviewed-by: Benjamin Marzinski <[email protected]>