Subject: Re: testing io.low limit for blk-throttle
From: Paolo Valente
Date: Tue, 24 Apr 2018 14:12:51 +0200
To: Joseph Qi
Cc: linux-block, Jens Axboe, Shaohua Li, Mark Brown, Linus Walleij, Ulf Hansson, LKML, Tejun Heo
Message-Id: <536A1B1D-575F-4193-ADA6-BA832AEC7179@linaro.org>
In-Reply-To: <18accc1e-c7b3-86a7-091b-1d4b631fcd4a@gmail.com>
References: <4c6b86d9-1668-43c3-c159-e6e23ffb04b4@gmail.com> <18accc1e-c7b3-86a7-091b-1d4b631fcd4a@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

> On 23 Apr 2018, at 11:01, Joseph Qi wrote:
>
> On 18/4/23 15:35, Paolo Valente wrote:
>>
>>> On 23 Apr 2018, at 08:05, Joseph Qi wrote:
>>>
>>> Hi Paolo,
>>
>> Hi Joseph,
>> thanks for chiming in.
>>
>>> What's your idle and latency config?
>>
>> I didn't set them at all, as the only (explicit) requirement in my
>> basic test is that one of the groups is guaranteed a minimum bps.
>>
>>> IMO, io.low will allow others run more bandwidth if cgroup's average
>>> idle time is high or latency is low.
>>
>> What you say here makes me think that I simply misunderstood the
>> purpose of io.low. So, here is my problem/question: "I only need to
>> guarantee at least a minimum bandwidth, in bps, to a group. Is the
>> io.low limit the way to go?"
>>
>> I know that I can use just io.max (unless I misunderstood the goal of
>> io.max too :( ), but my extra purpose would be to not waste bandwidth
>> when some group is idle. Yet, as of now, io.low is not working even
>> for the first, simpler goal, i.e., guaranteeing a minimum bandwidth to
>> one group when all groups are active.
>>
>> Am I getting something wrong?
>>
>> Otherwise, if there are some special values for idle and latency
>> parameters that would make throttle work for my test, I'll be of
>> course happy to try them.
>>
> I think you can try idle time with 1000us for all cgroups, and latency
> target 100us for cgroup with low limit 100MB/s and 2000us for cgroups
> with low limit 10MB/s. That means cgroup with low latency target will
> be preferred.
> BTW, from my experience the parameters are not easy to set because
> they are strongly correlated to the cgroup IO behavior.
>

+Tejun (I guess he might be interested in the results below)

Hi Joseph,
thanks for chiming in. Your suggestion did work!

At first, I thought I had also understood the use of latency from the
outcome of your suggestion: "want low limit really guaranteed for a
group? set target latency to a low value for it." But then, as a
crosscheck, I repeated the exact same test, but reversing target
latencies: I gave 2000 to the interfered (the group with 100MB/s
limit) and 100 to the interferers. And the interfered still got more
than 100MB/s! So I exaggerated: 20000 to the interfered. Same outcome :(

I tried many other combinations to figure this out, but the results
seemed more or less random w.r.t. latency values. I didn't even start
to test different values for idle.
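
For reference, the settings Joseph suggested can be spelled out as
io.low lines. This is only a sketch under several assumptions: the
device number 8:0 and the helper name io_low_line are made up for
illustration, 1MB is taken as 10^6 bytes, setting both rbps and wbps
is a choice of the sketch, and the key names (rbps, wbps, idle,
latency) are the ones blk-throttle accepts for io.low as far as I
know. Nothing is written to any cgroup file; the lines are only
printed.

```python
MB = 10**6  # assumption: 1MB = 10^6 bytes


def io_low_line(dev, bps, idle_us, latency_us):
    """Build one io.low line for device `dev` (MAJ:MIN). Hypothetical helper."""
    return f"{dev} rbps={bps} wbps={bps} idle={idle_us} latency={latency_us}"


# Joseph's suggestion: idle 1000us for every cgroup; latency target 100us
# for the group with the 100MB/s low limit, 2000us for the 10MB/s groups.
interfered = io_low_line("8:0", 100 * MB, 1000, 100)
interferer = io_low_line("8:0", 10 * MB, 1000, 2000)

print(interfered)
print(interferer)
```

Each printed line would then be written to the io.low file of the
corresponding cgroup.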

So, the only sound lesson that I seem to have learned is: if I want
low limits to be enforced, I have to set target latency and idle
explicitly. The actual values of latencies matter little, or not at
all. At least this holds for my simple tests.

At any rate, thanks to your help, Joseph, I could move to the most
interesting part for me: how effective is blk-throttle with low
limits? I could well be wrong again, but my results do not seem that
good. With the simplest type of non-toy example I considered, I
recorded throughput losses, apparently caused mainly by blk-throttle,
and ranging from 64% to 75%.

Here is a worst-case example. For each step, I'm reporting below the
command by which you can reproduce that step with the
thr-lat-with-interference benchmark of the S suite [1]. I just split
bandwidth equally among five groups, on my SSD. The device showed a
peak rate of ~515MB/s in this test, so I set rbps to 100MB/s for each
group (and tried various values, and combinations of values, for the
target latency, without any effect on the results). To begin, I made
every group do sequential reads. Everything worked perfectly fine.

But then I made one group do random I/O [2], and troubles began. Even
though the group doing random I/O was given a target latency of
100usec (or lower), while the others had a target latency of 2000usec,
the poor random-I/O group got only 4.7MB/s! (A single process doing 4k
sync random I/O reaches 25MB/s on my SSD.)

I guess things broke because the low limits no longer complied with
the lower speed that the device reached with the new, mixed workload:
the device reached 376MB/s, while the sum of the low limits was
500MB/s. BTW the 'fault' for this loss of throughput did not lie only
with the device and the workload: if I switched throttling off, then
the device still reached its peak rate, although granting only 1.3MB/s
to the random-I/O group.
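
The mismatch between the low limits and the device rate can be checked
with trivial arithmetic. The snippet below is only an illustration
using the figures reported in this mail; the function name
low_limits_feasible is made up for the purpose.

```python
def low_limits_feasible(low_limits_mbps, device_rate_mbps):
    """True if the sum of the low limits fits within the given device rate."""
    return sum(low_limits_mbps) <= device_rate_mbps


limits = [100] * 5  # five groups, 100MB/s low limit each

print(sum(limits))                        # sum of the low limits, in MB/s
print(low_limits_feasible(limits, 515))   # against the sequential-only peak rate
print(low_limits_feasible(limits, 376))   # against the rate with one random-I/O group
```

With all groups doing sequential reads the limits fit within the peak
rate; with the mixed workload they no longer do.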

So, to comply with the 376MB/s, I lowered the low limits to 74MB/s per
group (to avoid a too tight 75MB/s) [3]. A little better: the
random-I/O group got 7.2MB/s. But the total throughput went down
further, to 289MB/s, and became again lower than the sum of the low
limits. Most certainly, this time the throughput went down mainly
because blk-throttle was serving the random I/O more than before.

To make a long story short, I ended up setting just 12MB/s as low
limit for each group [4]. The random-I/O group was finally happy, with
a revitalizing 12.77MB/s. But the total throughput dropped down to
127MB/s, i.e., ~25% of the peak rate of the device. Now the 'fault'
for the throughput loss seemed to lie undoubtedly with blk-throttle,
which was evidently over-throttling some group.

To sum up, for my device, 12MB/s seems to be the highest value for
which low limits can be guaranteed. But setting these limits entails a
high cost: if just one group really does random I/O, then 75% of the
throughput is lost.

There would be other issues too. For example, 12MB/s might be too
little for the needs of some group in some time period. This fact
would make it extremely difficult, if at all possible, to set low
limits that comply with the needs of more dynamic (and probably more
realistic) workloads than the above one.

I think this is all. Sorry for the long mail, I tried to shrink it as
much as possible.

Looking forward to some feedback.

Thanks,
Paolo

[1] https://github.com/Algodev-github/S
[2] sudo ./thr-lat-with-interference.sh -b t -n 4 -w 100M -W 100M -t randread -L 2000
[3] sudo ./thr-lat-with-interference.sh -b t -n 4 -w 74M -W 74M -t randread -L 2000
[4] sudo ./thr-lat-with-interference.sh -b t -n 4 -w 12M -W 12M -t randread -L 2000
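
As a cross-check of the figures for the three steps above, here is a
small sketch that computes the throughput loss relative to the
~515MB/s peak rate, and whether the random-I/O group reached its low
limit. It uses only the numbers reported in this mail; the names
(PEAK, steps, loss_pct) are chosen just for this illustration.

```python
PEAK = 515  # MB/s, peak rate reported for the all-sequential workload

# (low limit per group, total throughput, random-I/O group throughput), in MB/s
steps = [(100, 376, 4.7), (74, 289, 7.2), (12, 127, 12.77)]


def loss_pct(total, peak=PEAK):
    """Throughput loss, in percent, relative to the peak rate."""
    return 100 * (1 - total / peak)


for low, total, rand in steps:
    met = rand >= low  # did the random-I/O group get its low limit?
    print(f"low={low}MB/s total={total}MB/s loss={loss_pct(total):.0f}% "
          f"random-I/O={rand}MB/s limit met: {met}")
```

Only the last step meets the low limit for the random-I/O group, and
it is also the step with the largest loss.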