Problem statement: if sync(2) races with bdi_wakeup_thread_delayed
(which is called when the first inode for a bdi is marked dirty), sync
can be delayed for a long time (5 seconds if dirty_writeback_interval is
left at its default value).
How it works: sync schedules bdi work for immediate processing by calling
mod_delayed_work with 'delay' equal to 0. The bdi work is queued to
pool->worklist and wake_up_worker(pool) is called, but before the worker
takes the work off the list, __mark_inode_dirty intervenes and calls
bdi_wakeup_thread_delayed, which calls mod_delayed_work with 'timeout'
equal to dirty_writeback_interval multiplied by 10 (converted to jiffies).
mod_delayed_work calls try_to_grab_pending, which successfully steals the
work from the worklist. The work is then re-queued with the new delay.
Until that timeout elapses, sync(2) sits in wait_for_completion in
sync_inodes_sb.
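As a rough illustration only, the sequence above can be replayed with a
toy userspace model in C. This is not kernel code: the model_* names and
the single-struct representation of a delayed work item are assumptions
made for the sketch, and the real workqueue code is far more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model (hypothetical, not kernel code): one delayed work item. */
struct model_dwork {
	bool pending;           /* item is queued or its timer is armed */
	unsigned long expires;  /* model time at which the work would run */
};

/*
 * Models the mod_delayed_work() behaviour described above: the item is
 * always (re)armed with the new delay, even if it was already pending.
 * Returns true if a pending item was modified.
 */
static bool model_mod_delayed_work(struct model_dwork *w, unsigned long now,
				   unsigned long delay)
{
	bool was_pending;

	assert(w != NULL);
	was_pending = w->pending;
	w->pending = true;
	w->expires = now + delay;
	return was_pending;
}

/* Replays the race; returns the model time at which the work finally runs. */
static unsigned long model_sync_race(unsigned long interval)
{
	struct model_dwork w = { .pending = false, .expires = 0 };

	model_mod_delayed_work(&w, 0, 0);        /* sync(2): run immediately */
	model_mod_delayed_work(&w, 0, interval); /* dirty path: steals and re-arms */
	return w.expires;
}
```

In this model, model_sync_race(5000) returns 5000: the immediate request
made on behalf of sync(2) is pushed out by the full interval.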
The patch switches bdi_wakeup_thread_delayed, which __mark_inode_dirty
calls, from mod_delayed_work to queue_delayed_work. This should be safe
because even if queue_delayed_work returns false (the work is already on
a queue), bdi_writeback_workfn will re-schedule itself by looking at
wb->b_dirty.
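The safety argument can be sketched with a toy userspace model in C. This
is not kernel code: the toy_* names are made up, and toy_workfn_tail is
only an assumed simplification of the re-scheduling that
bdi_writeback_workfn is said to do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model (hypothetical, not kernel code): one delayed work item. */
struct toy_dwork {
	bool pending;           /* item is queued or its timer is armed */
	unsigned long expires;  /* model time at which the work would run */
};

/*
 * Models queue_delayed_work(): if the item is already pending it is left
 * untouched and false is returned; otherwise it is armed with 'delay'.
 */
static bool toy_queue_delayed_work(struct toy_dwork *w, unsigned long now,
				   unsigned long delay)
{
	assert(w != NULL);
	if (w->pending)
		return false;   /* keep the existing (possibly earlier) expiry */
	w->pending = true;
	w->expires = now + delay;
	return true;
}

/*
 * Assumed simplification of the end of the work function: once the work
 * has run, re-arm it if dirty inodes remain.
 */
static void toy_workfn_tail(struct toy_dwork *w, unsigned long now,
			    bool have_dirty, unsigned long interval)
{
	assert(w != NULL);
	w->pending = false;
	if (have_dirty)
		toy_queue_delayed_work(w, now, interval);
}
```

In this model a zero-delay request followed by a delayed one keeps its
zero expiry, and the periodic wakeup that was refused is restored by
toy_workfn_tail once the work has run.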
Signed-off-by: Maxim Patlasov <[email protected]>
---
mm/backing-dev.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index ce682f7..3fde024 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -294,7 +294,7 @@ void bdi_wakeup_thread_delayed(struct backing_dev_info *bdi)
 	unsigned long timeout;
 
 	timeout = msecs_to_jiffies(dirty_writeback_interval * 10);
-	mod_delayed_work(bdi_wq, &bdi->wb.dwork, timeout);
+	queue_delayed_work(bdi_wq, &bdi->wb.dwork, timeout);
 }
 
 /*
Hello,
On Fri, Sep 20, 2013 at 04:52:26PM +0400, Maxim Patlasov wrote:
> @@ -294,7 +294,7 @@ void bdi_wakeup_thread_delayed(struct backing_dev_info *bdi)
>  	unsigned long timeout;
>
>  	timeout = msecs_to_jiffies(dirty_writeback_interval * 10);
> -	mod_delayed_work(bdi_wq, &bdi->wb.dwork, timeout);
> +	queue_delayed_work(bdi_wq, &bdi->wb.dwork, timeout);
Hmmm... this at least requires a comment explaining why
mod_delayed_work() doesn't work here. Also, I wonder whether what we
need is a function which queues if !pending and shortens the timer
if pending. This is a relatively common pattern and the suggested fix
is subtle and fragile.
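
To make the "queue if !pending, shorten if pending" idea concrete, here is
a toy userspace sketch in C. It is not kernel code and does not correspond
to any existing workqueue function: sketch_queue_or_shorten and struct
sketch_dwork are hypothetical names, and the plain '<' comparison ignores
the jiffies wraparound that real kernel code would have to handle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model (hypothetical, not kernel code): one delayed work item. */
struct sketch_dwork {
	bool pending;           /* item is queued or its timer is armed */
	unsigned long expires;  /* model time at which the work would run */
};

/*
 * Hypothetical helper: arm the item if it is idle; if it is already
 * pending, only ever move its expiry earlier, never later.
 * Returns true if the item was newly queued.
 */
static bool sketch_queue_or_shorten(struct sketch_dwork *w, unsigned long now,
				    unsigned long delay)
{
	unsigned long expires = now + delay;

	assert(w != NULL);
	if (!w->pending) {
		w->pending = true;
		w->expires = expires;
		return true;
	}
	if (expires < w->expires)  /* model ignores counter wraparound */
		w->expires = expires;
	return false;
}
```

With this shape, call order stops mattering in the model: an immediate
request followed by a delayed one, or the reverse, both end with the work
due immediately.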
Thanks.
--
tejun