From: Josef Bacik
To: axboe@kernel.dk, kernel-team@fb.com, linux-block@vger.kernel.org,
    akpm@linux-foundation.org, hannes@cmpxchg.org, tj@kernel.org,
    linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Cc: Josef Bacik
Subject: [PATCH 13/14] Documentation: add a doc for blk-iolatency
Date: Fri, 29 Jun 2018 15:25:41 -0400
Message-Id: <20180629192542.26649-14-josef@toxicpanda.com>
In-Reply-To: <20180629192542.26649-1-josef@toxicpanda.com>
References: <20180629192542.26649-1-josef@toxicpanda.com>

From: Josef Bacik

Add basic documentation describing the interface, statistics, and behavior
of io.latency.

Signed-off-by: Josef Bacik
---
 Documentation/cgroup-v2.txt | 79 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 79 insertions(+)

diff --git a/Documentation/cgroup-v2.txt b/Documentation/cgroup-v2.txt
index 74cdeaed9f7a..1fd46969c938 100644
--- a/Documentation/cgroup-v2.txt
+++ b/Documentation/cgroup-v2.txt
@@ -51,6 +51,9 @@ v1 is available under Documentation/cgroup-v1/.
      5-3. IO
        5-3-1. IO Interface Files
        5-3-2. Writeback
+       5-3-3. IO Latency
+         5-3-3-1. How IO Latency Throttling Works
+         5-3-3-2. IO Latency Interface Files
      5-4. PID
        5-4-1. PID Interface Files
      5-5. Device
@@ -1395,6 +1398,82 @@ writeback as follows.
        vm.dirty[_background]_ratio.

+IO Latency
+~~~~~~~~~~
+
+This is a cgroup v2 controller for IO workload protection.  You provide a group
+with a latency target, and if the average latency exceeds that target the
+controller will throttle any peers that have a higher latency target than the
+protected workload.
+
+The limits are only applied at the peer level in the hierarchy.  This means that
+in the diagram below, only groups A, B, and C will influence each other, and
+groups D and F will influence each other.  Group G will influence nobody.
+
+                    [root]
+                  /    |    \
+                 A     B     C
+                / \    |
+               D   F   G
+
+
+So the ideal way to configure this is to set io.latency in groups A, B, and C.
+Generally you do not want to set a value lower than the latency your device
+supports.  Experiment to find the value that works best for your workload.
+Start at higher than the expected latency for your device and watch the
+total_lat_avg value in io.stat for your workload group to get an idea of the
+latency you see during normal operation.  Use this value as a basis for your
+real setting, setting at 10-15% higher than the value in io.stat.
+Experimentation is key here because total_lat_avg is a running total, so is the
+"statistics" portion of "lies, damned lies, and statistics."
+
+How IO Latency Throttling Works
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+io.latency is work conserving; so as long as everybody is meeting their latency
+target the controller doesn't do anything.  Once a group starts missing its
+target it begins throttling any peer group that has a higher target than itself.
+This throttling takes 2 forms:
+
+- Queue depth throttling.  This is the number of outstanding IOs a group is
+  allowed to have.  We will clamp down relatively quickly, starting at no limit
+  and going all the way down to 1 IO at a time.
+
+- Artificial delay induction.  There are certain types of IO that cannot be
+  throttled without possibly adversely affecting higher priority groups.  This
+  includes swapping and metadata IO.  These types of IO are allowed to occur
+  normally, however they are "charged" to the originating group.  If the
+  originating group is being throttled you will see the use_delay and delay
+  fields in io.stat increase.  The delay value is how many microseconds are
+  being added to any process that runs in this group.  Because this number can
+  grow quite large if there is a lot of swapping or metadata IO occurring we
+  limit the individual delay events to 1 second at a time.
+
+Once the victimized group starts meeting its latency target again it will start
+unthrottling any peer groups that were throttled previously.  If the victimized
+group simply stops doing IO the global counter will unthrottle appropriately.
+
+IO Latency Interface Files
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+  io.latency
+        This takes a similar format as the other controllers.
+
+                "MAJOR:MINOR target=