From: Linus Walleij
Date: Fri, 28 Oct 2016 00:27:28 +0200
Subject: Re: [PATCH 00/14] introduce the BFQ-v0 I/O scheduler as an extra scheduler
To: Jens Axboe
Cc: Ulf Hansson, Paolo Valente, Christoph Hellwig, Arnd Bergmann, Bart Van Assche, Jan Kara, Tejun Heo, linux-block@vger.kernel.org, Linux-Kernal, Mark Brown, Hannes Reinecke, Grant Likely, James Bottomley

On Thu, Oct 27, 2016 at 11:08 PM, Jens Axboe wrote:

> blk-mq has evolved to support a variety of devices, there's nothing
> special about mmc that can't work well within that framework.

There is. Read mmc_queue_thread() in drivers/mmc/card/queue.c

This repeatedly calls req = blk_fetch_request(q);, starting one request
and then getting the next one off the queue, including reading
a few NULL requests off the end of the queue (to satisfy the
semantics of its state machine).
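The fetch loop described above can be sketched roughly as follows. This is a simplified user-space illustration, not the code in drivers/mmc/card/queue.c: struct request, struct queue, fetch_request() and run_queue_thread() are hypothetical stand-ins, and "starting" or "completing" a request is just a printf.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins, NOT the kernel's struct request / request_queue. */
struct request {
    int id;
};

struct queue {
    struct request *items;
    size_t count;
    size_t next;
};

/* Toy analogue of blk_fetch_request(): next request, or NULL when empty. */
static struct request *fetch_request(struct queue *q)
{
    if (q->next >= q->count)
        return NULL;
    return &q->items[q->next++];
}

/*
 * Start each fetched request before completing the previous one, so that
 * one request is always being prepared while another is in flight.
 * A NULL fetch with a request still in flight drains that request; a NULL
 * fetch with nothing in flight ends the loop.
 * Returns the number of completed requests and counts the NULL fetches.
 */
static int run_queue_thread(struct queue *q, int *null_fetches)
{
    struct request *prev = NULL;
    int completed = 0;

    *null_fetches = 0;
    for (;;) {
        struct request *req = fetch_request(q);

        if (!req)
            (*null_fetches)++;
        if (!req && !prev)
            break; /* nothing new, nothing in flight: go idle */

        if (req)
            printf("start    req %d\n", req->id);
        if (prev) {
            printf("complete req %d\n", prev->id);
            completed++;
        }
        prev = req;
    }
    return completed;
}
```

With three queued requests this sketch completes all three and fetches NULL twice: once to drain the last in-flight request and once to notice there is nothing left to do.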
The thread then preprocesses each request by essentially calling .pre()
and .post() hooks all the way down to the driver, flushing its mapped
sglist from CPU to DMA device memory (not a problem on x86 and
other DMA-coherent archs, but a big win on the incoherent ones).

In the attempt that was posted recently this is achieved by lying
and saying the HW queue is two items deep and eating requests
off that queue calling pre/post on them.

But as there actually exist MMC cards with command queueing, this
would become hopeless to handle: the hw queue depth has to reflect
the real depth.

What we need is for the block core to call pre/post hooks on each
request.

The "only" thing that doesn't work well after that is that CFQ is no
longer in action, which will have interesting effects on MMC throughput
in any fio-like stress test as it is mostly single-hw-queue.

Yours,
Linus Walleij