Subject: Re: CFQ idling kills I/O performance on ext4 with blkio cgroup controller
From: "Srivatsa S. Bhat"
To: Paolo Valente
Cc: linux-fsdevel@vger.kernel.org, linux-block, linux-ext4@vger.kernel.org,
    cgroups@vger.kernel.org, kernel list, Jens Axboe, Jan Kara, Jeff Moyer,
    Theodore Ts'o, amakhalov@vmware.com, anishs@vmware.com,
    srivatsab@vmware.com, Ulf Hansson, Linus Walleij
Date: Sun, 2 Jun 2019 00:04:34 -0700
Message-ID: <6a6f4aa4-fc95-f132-55b2-224ff52bd2d8@csail.mit.edu>
In-Reply-To: <7B74A790-BD98-412B-ADAB-3B513FB1944E@linaro.org>

On 5/30/19 3:45 AM, Paolo Valente wrote:
>
>> On 30 May 2019, at 10:29, Srivatsa S. Bhat wrote:
>> [...]
>>
>> Your fix held up well under my testing :)
>>
>
> Great!
>
>> As for throughput, with low_latency = 1, I get around 1.4 MB/s with
>> bfq (vs 1.6 MB/s with mq-deadline). This is a huge improvement
>> compared to what it was before (70 KB/s).
>>
>
> That's beautiful news!
>
> So, now we have the best of both worlds: maximum throughput and
> total control over I/O (including minimum latency for interactive and
> soft real-time applications). Besides, no manual configuration is
> needed. Of course, this holds unless/until you find other flaws ... ;)
>

Indeed, that's awesome! :)
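(For anyone else reproducing these numbers: both settings are plain
sysfs attributes, so switching a device between the two configurations
can be scripted along these lines. This is just a minimal sketch,
assuming the disk is sda and BFQ is available on the running kernel:

    #!/usr/bin/env python3
    # Minimal sketch: select the bfq scheduler for a block device and
    # enable its low_latency heuristic via sysfs (needs root).
    from pathlib import Path

    dev = "sda"  # illustrative device name; adjust for your system
    queue = Path("/sys/block") / dev / "queue"

    # Pick bfq as the active I/O scheduler for this device.
    (queue / "scheduler").write_text("bfq")

    # BFQ's tunables appear under queue/iosched/ once it is active;
    # low_latency defaults to 1 (on).
    (queue / "iosched" / "low_latency").write_text("1")

    # Show the result, e.g. "mq-deadline kyber [bfq] none".
    print((queue / "scheduler").read_text().strip())

A plain echo from a shell works just as well, of course; the point is
only that no kernel rebuild or boot parameter is involved.)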
>> With tracing on, the throughput is a bit lower (as expected, I guess),
>> about 1 MB/s, and the corresponding trace file
>> (trace-waker-detection-1MBps) is available at:
>>
>> https://www.dropbox.com/s/3roycp1zwk372zo/bfq-traces.tar.gz?dl=0
>>
>
> Thank you for the new trace. I've analyzed it carefully, and, as I
> imagined, this residual 12% throughput loss is due to a couple of
> heuristics that occasionally get something wrong. Most likely, ~12%
> is the worst-case loss, and if one repeats the tests, the loss may be
> much lower in some runs.
>

Ah, I see.

> I think it is very hard to eliminate this fluctuation while keeping
> full I/O control. But, who knows, I might have a lucky idea in the
> future.
>

:)

> At any rate, since you pointed out that you are interested in
> out-of-the-box performance, let me complete the context: if
> low_latency is left set, then in return for this 12% loss one gets
> a) at least 1000% higher responsiveness, e.g., 1000% lower start-up
> times for applications under load [1];
> b) 500-1000% higher throughput in multi-client server workloads, as I
> already pointed out [2].
>

I'm very happy that you could solve the problem without having to
compromise on any of the performance characteristics/features of BFQ!

> I'm going to prepare complete patches. In addition, if that's OK with
> you, I'll report these results on the bug you created. Then I guess
> we can close it.
>

Sounds great!

> [1] https://algo.ing.unimo.it/people/paolo/disk_sched/results.php
> [2] https://www.linaro.org/blog/io-bandwidth-management-for-production-quality-services/
>
>> Thank you so much for your tireless efforts in fixing this issue!
>>
>
> I did enjoy working on this with you: your test case and your support
> enabled me to make important improvements. So, thank you very much
> for your collaboration so far,
> Paolo

My pleasure! :)

Regards,
Srivatsa
VMware Photon OS