From: Spencer Baugh
To: Jens Axboe, linux-kernel@vger.kernel.org (open list)
Cc: Joern Engel, Spencer Baugh, Joern Engel, Spencer Baugh
Subject: [PATCH] block: round timeouts to 100ms instead of 1s
Date: Thu, 23 Jul 2015 14:54:35 -0700
Message-Id: <1437688476-3399-5-git-send-email-sbaugh@catern.com>

From: Joern Engel

Users can request timeouts as low as 1s. However, whatever the request
timeout happens to be, we always round it up by up to 1s. So at the
lower end the rounding doubles the user-requested timeout. Reduce the
impact of rounding for small timeout values.

Curious side note: the staggering done in round_jiffies_common() has
the effect of firing timers at slightly different times on different
cpus. The intended result seems to be that not all cpus handle timers
at the same time. However, this trick only works if the timeout
calculation and the firing of the timer happen on the same cpu. For
block queues the effect is that instead of bunching timers to trigger
just once per second, they trigger about once per second _per cpu_.
Or rather they used to before this patch. So on reasonably-sized
systems the timers can actually trigger less frequently, in spite of
better precision.
Signed-off-by: Joern Engel
Signed-off-by: Spencer Baugh
---
 block/blk-timeout.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/block/blk-timeout.c b/block/blk-timeout.c
index 246dfb1..0d06162 100644
--- a/block/blk-timeout.c
+++ b/block/blk-timeout.c
@@ -127,6 +127,17 @@ static void blk_rq_check_expired(struct request *rq, unsigned long *next_timeout
 	}
 }
 
+/*
+ * With SSDs it gets realistic to set a short timeout of 1s. But if
+ * every timeout gets rounded up by as much as a second, the effective
+ * limit is 2s. Round jiffies a bit more precisely to about 100ms
+ * instead.
+ */
+static unsigned long round_jiffies_up_100ms(unsigned long j)
+{
+	return round_up(j, rounddown_pow_of_two(HZ / 10));
+}
+
 void blk_rq_timed_out_timer(unsigned long data)
 {
 	struct request_queue *q = (struct request_queue *) data;
@@ -140,7 +151,7 @@ void blk_rq_timed_out_timer(unsigned long data)
 		blk_rq_check_expired(rq, &next, &next_set);
 
 	if (next_set)
-		mod_timer(&q->timeout, round_jiffies_up(next));
+		mod_timer(&q->timeout, round_jiffies_up_100ms(next));
 
 	spin_unlock_irqrestore(q->queue_lock, flags);
 }
@@ -170,7 +181,7 @@ unsigned long blk_rq_timeout(unsigned long timeout)
 {
 	unsigned long maxt;
 
-	maxt = round_jiffies_up(jiffies + BLK_MAX_TIMEOUT);
+	maxt = round_jiffies_up_100ms(jiffies + BLK_MAX_TIMEOUT);
 	if (time_after(timeout, maxt))
 		timeout = maxt;
 
@@ -215,7 +226,7 @@ void blk_add_timer(struct request *req)
 	 * than an existing one, modify the timer. Round up to next nearest
 	 * second.
 	 */
-	expiry = blk_rq_timeout(round_jiffies_up(req->deadline));
+	expiry = blk_rq_timeout(round_jiffies_up_100ms(req->deadline));
 
 	if (!timer_pending(&q->timeout) ||
 	    time_before(expiry, q->timeout.expires)) {
-- 
2.5.0.rc3