Date: Wed, 6 May 2015 13:17:33 +0800
Subject: Re: [PATCH 2/2] block: loop: avoiding too many pending per work I/O
From: Ming Lei
To: Tejun Heo
Cc: Jens Axboe, Linux Kernel Mailing List, "Justin M. Forbes", Jeff Moyer, Christoph Hellwig, "v4.0"
References: <1430826595-5888-1-git-send-email-ming.lei@canonical.com> <1430826595-5888-3-git-send-email-ming.lei@canonical.com> <20150505135958.GO1971@htj.duckdns.org> <20150505165541.GV1971@htj.duckdns.org>
List-ID: linux-kernel@vger.kernel.org

On Wed, May 6, 2015 at 11:14 AM, Ming Lei wrote:
> On Wed, May 6, 2015 at 12:55 AM, Tejun Heo wrote:
>> Hello, Ming.
>>
>> On Tue, May 05, 2015 at 10:46:10PM +0800, Ming Lei wrote:
>>> On Tue, May 5, 2015 at 9:59 PM, Tejun Heo wrote:
>>> > It's a bit weird to hard-code this to 16, as it effectively becomes
>>> > a hidden bottleneck for concurrency. For cases where 16 isn't a good
>>> > value, hunting down what's going on can be painful, because the limit
>>> > isn't visible anywhere. I still think the right knob for controlling
>>> > concurrency is nr_requests for the loop device. You said that for
>>> > linear IOs it's better to have higher nr_requests than concurrency,
>>> > but can you elaborate why?
>>>
>>> I mean, in the case of sequential IO the requests are more likely to
>>> hit the page cache, so each one completes quickly; it is then often
>>> more efficient to handle them all in one context (for example, one by
>>> one from the IO queue) than from different contexts scheduled across
>>> worker threads. That can be achieved by setting a bigger nr_requests
>>> (queue depth).
>>
>> Ah, so it's about the queueing latency. Blocking the issuer on the
>> get_request side, for the same level of concurrency, would incur a much
>> longer latency before the next IO can be dispatched. The arbitrary 16
>> still bothers me, but it's fine for now; we need to revisit the whole
>> thing, including the WQ_HIGHPRI usage. Maybe that made sense when we
>> had only one thread servicing all IOs, but with high concurrency I
>> don't think it's a good idea.
>
> Yes, I was thinking about that too, but concurrency can improve random
> I/O throughput a lot in my tests.

Thinking about it further, the problem is very similar to the
'Non-blocking buffered file read operations' work[1] discussed last
year. If a read can be predicted to be served from the page cache, we
handle it in a single thread; otherwise we handle it concurrently. That
approach should be more efficient where it is applicable, I think.

But I still prefer the dio/aio approach, because it avoids the double
caching, which was a big win in my previous tests.

[1] https://lwn.net/Articles/612483/

Thanks,
Ming Lei