Date: Wed, 6 May 2015 13:17:33 +0800
Subject: Re: [PATCH 2/2] block: loop: avoiding too many pending per work I/O
From: Ming Lei
To: Tejun Heo
Cc: Jens Axboe, Linux Kernel Mailing List, "Justin M. Forbes", Jeff Moyer, Christoph Hellwig, "v4.0"
References: <1430826595-5888-1-git-send-email-ming.lei@canonical.com> <1430826595-5888-3-git-send-email-ming.lei@canonical.com> <20150505135958.GO1971@htj.duckdns.org> <20150505165541.GV1971@htj.duckdns.org>
List-ID: linux-kernel@vger.kernel.org

On Wed, May 6, 2015 at 11:14 AM, Ming Lei wrote:
> On Wed, May 6, 2015 at 12:55 AM, Tejun Heo wrote:
>> Hello, Ming.
>>
>> On Tue, May 05, 2015 at 10:46:10PM +0800, Ming Lei wrote:
>>> On Tue, May 5, 2015 at 9:59 PM, Tejun Heo wrote:
>>> > It's a bit weird to hard-code this to 16, as it effectively becomes
>>> > a hidden bottleneck for concurrency. For cases where 16 isn't a good
>>> > value, hunting down what's going on can be painful, because the limit
>>> > isn't visible anywhere. I still think the right knob for controlling
>>> > concurrency is nr_requests for the loop device. You said that for
>>> > linear IOs it's better to have higher nr_requests than concurrency,
>>> > but can you elaborate why?
>>>
>>> I mean, in the case of sequential IO the requests are more likely to
>>> hit the page cache, so each one completes quickly; it is then often
>>> more efficient to handle them all in one context (for example, one by
>>> one from the IO queue) than from different contexts scheduled across
>>> worker threads. That can be achieved by setting a bigger nr_requests
>>> (queue depth).
>>
>> Ah, so it's about the queueing latency. Blocking the issuer on the
>> get_request side, for the same level of concurrency, would incur a much
>> longer latency before the next IO can be dispatched. The arbitrary 16
>> still bothers me, but it's fine for now; we need to revisit the whole
>> thing, including the WQ_HIGHPRI usage. Maybe that made sense when we
>> had only one thread servicing all IOs, but with high concurrency I
>> don't think it's a good idea.
>
> Yes, I was thinking about that too, but concurrency can improve random
> I/O throughput a lot in my tests.

Thinking about it further, the problem is very similar to the
'Non-blocking buffered file read operations' work[1] discussed last
year. If a read can be predicted to be served from the page cache, we
handle it in a single thread; otherwise we handle it concurrently. That
approach should be more efficient where it is applicable, I think.

But I still prefer the dio/aio approach, because it avoids the double
caching, which was a big win in my previous tests.

[1] https://lwn.net/Articles/612483/

Thanks,
Ming Lei