From: Dennis Zhou
To: Tejun Heo, Jens Axboe, Josef Bacik
Cc: kernel-team@fb.com, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, Johannes Weiner, "Dennis Zhou (Facebook)"
Subject: [PATCH] block: make iolatency avg_lat exponentially decay
Date: Tue, 31 Jul 2018 13:36:47 -0700
Message-Id: <20180731203647.19864-1-dennisszhou@gmail.com>
X-Mailer: git-send-email 2.13.5
X-Mailing-List: linux-kernel@vger.kernel.org

From: "Dennis Zhou (Facebook)"

Currently, avg_lat is calculated by accumulating the mean of every
window in a long running cumulative average. As time goes on, the metric
becomes less and less useful due to the accumulated history.

This patch reuses the same calculation done in load averages to make the
avg_lat metric more lively. Unlike load averages, the avg only advances
when a window elapses (due to an io). Idle periods extend the most
recent window. Bucketing is used to limit the history of avg_lat by
binding it to the window size. So, the window range for 1/exp (decay
rate) is [1 min, 2.5 min) when windows elapse immediately.

Signed-off-by: Dennis Zhou
---
 block/blk-iolatency.c | 45 ++++++++++++++++++++++++++++++-------------
 1 file changed, 32 insertions(+), 13 deletions(-)

diff --git a/block/blk-iolatency.c b/block/blk-iolatency.c
index bb59b2929e0d..1db3244eac05 100644
--- a/block/blk-iolatency.c
+++ b/block/blk-iolatency.c
@@ -69,6 +69,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include "blk-rq-qos.h"
@@ -127,7 +128,6 @@ struct iolatency_grp {
 
 	/* total running average of our io latency. */
 	u64 total_lat_avg;
-	u64 total_lat_nr;
 
 	/* Our current number of IO's for the last summation. */
 	u64 nr_samples;
@@ -135,6 +135,24 @@ struct iolatency_grp {
 	struct child_latency_info child_lat;
 };
 
+#define BLKIOLATENCY_MIN_WIN_SIZE (100 * NSEC_PER_MSEC)
+#define BLKIOLATENCY_MAX_WIN_SIZE NSEC_PER_SEC
+/*
+ * These are the constants used to fake the fixed-point moving average
+ * calculation just like load average. The latency window is bucketed to
+ * try to approximately calculate average latency for the last 1 minute.
+ */
+#define BLKIOLATENCY_NR_EXP_FACTORS 5
+#define BLKIOLATENCY_EXP_BUCKET_SIZE (BLKIOLATENCY_MAX_WIN_SIZE / \
+				      (BLKIOLATENCY_NR_EXP_FACTORS - 1))
+static const u64 iolatency_exp_factors[BLKIOLATENCY_NR_EXP_FACTORS] = {
+	2045, // exp(1/600) - 600 samples
+	2039, // exp(1/240) - 240 samples
+	2031, // exp(1/120) - 120 samples
+	2023, // exp(1/80) - 80 samples
+	2014, // exp(1/60) - 60 samples
+};
+
 static inline struct iolatency_grp *pd_to_lat(struct blkg_policy_data *pd)
 {
 	return pd ? container_of(pd, struct iolatency_grp, pd) : NULL;
@@ -462,7 +480,7 @@ static void iolatency_check_latencies(struct iolatency_grp *iolat, u64 now)
 	struct child_latency_info *lat_info;
 	struct blk_rq_stat stat;
 	unsigned long flags;
-	int cpu;
+	int cpu, exp_idx;
 
 	blk_rq_stat_init(&stat);
 	preempt_disable();
@@ -480,11 +498,10 @@ static void iolatency_check_latencies(struct iolatency_grp *iolat, u64 now)
 
 	lat_info = &parent->child_lat;
 
-	iolat->total_lat_avg =
-		div64_u64((iolat->total_lat_avg * iolat->total_lat_nr) +
-			  stat.mean, iolat->total_lat_nr + 1);
-
-	iolat->total_lat_nr++;
+	exp_idx = min_t(int, BLKIOLATENCY_NR_EXP_FACTORS - 1,
+			iolat->cur_win_nsec / BLKIOLATENCY_EXP_BUCKET_SIZE);
+	CALC_LOAD(iolat->total_lat_avg, iolatency_exp_factors[exp_idx],
+		  stat.mean);
 
 	/* Everything is ok and we don't need to adjust the scale. */
 	if (stat.mean <= iolat->min_lat_nsec &&
@@ -700,8 +717,9 @@ static void iolatency_set_min_lat_nsec(struct blkcg_gq *blkg, u64 val)
 	u64 oldval = iolat->min_lat_nsec;
 
 	iolat->min_lat_nsec = val;
-	iolat->cur_win_nsec = max_t(u64, val << 4, 100 * NSEC_PER_MSEC);
-	iolat->cur_win_nsec = min_t(u64, iolat->cur_win_nsec, NSEC_PER_SEC);
+	iolat->cur_win_nsec = max_t(u64, val << 4, BLKIOLATENCY_MIN_WIN_SIZE);
+	iolat->cur_win_nsec = min_t(u64, iolat->cur_win_nsec,
+				    BLKIOLATENCY_MAX_WIN_SIZE);
 
 	if (!oldval && val)
 		atomic_inc(&blkiolat->enabled);
@@ -811,13 +829,14 @@ static size_t iolatency_pd_stat(struct blkg_policy_data *pd, char *buf,
 {
 	struct iolatency_grp *iolat = pd_to_lat(pd);
 	unsigned long long avg_lat = div64_u64(iolat->total_lat_avg, NSEC_PER_USEC);
+	unsigned long long cur_win = div64_u64(iolat->cur_win_nsec, NSEC_PER_MSEC);
 
 	if (iolat->rq_depth.max_depth == UINT_MAX)
-		return scnprintf(buf, size, " depth=max avg_lat=%llu",
-				 avg_lat);
+		return scnprintf(buf, size, " depth=max avg_lat=%llu win=%llu",
+				 avg_lat, cur_win);
 
-	return scnprintf(buf, size, " depth=%u avg_lat=%llu",
-			 iolat->rq_depth.max_depth, avg_lat);
+	return scnprintf(buf, size, " depth=%u avg_lat=%llu win=%llu",
+			 iolat->rq_depth.max_depth, avg_lat, cur_win);
 }
-- 
2.17.1
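
For readers who want to poke at the arithmetic outside the kernel, here is a
minimal userspace sketch of the decaying average and the bucket lookup the
patch describes. It is not the kernel code. The helper names ewma_update() and
exp_bucket_index() are made up for illustration, and the sketch assumes
CALC_LOAD has its usual load-average form with an 11-bit fixed-point scale
(FIXED_1 = 2048). Under that assumption the table values are consistent with
round(2048 * e^(-1/n)) for n = 600, 240, 120, 80, 60 samples.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match the kernel's load-average fixed-point scale. */
#define FSHIFT		11
#define FIXED_1		(1ULL << FSHIFT)

#define NSEC_PER_MSEC	1000000ULL
#define NSEC_PER_SEC	1000000000ULL

/* Same values as the patch; shortened names are local to this sketch. */
#define NR_EXP_FACTORS	5
#define EXP_BUCKET_SIZE	(NSEC_PER_SEC / (NR_EXP_FACTORS - 1))

static const uint64_t exp_factors[NR_EXP_FACTORS] = {
	2045,	/* ~600 samples */
	2039,	/* ~240 samples */
	2031,	/* ~120 samples */
	2023,	/* ~80 samples */
	2014,	/* ~60 samples */
};

/*
 * Hypothetical helper: one fixed-point decay step,
 * avg = (avg * factor + sample * (FIXED_1 - factor)) / FIXED_1.
 */
static inline uint64_t ewma_update(uint64_t avg, uint64_t exp_factor,
				   uint64_t sample)
{
	assert(exp_factor <= FIXED_1);
	avg *= exp_factor;
	avg += sample * (FIXED_1 - exp_factor);
	return avg >> FSHIFT;
}

/*
 * Hypothetical helper: map a window size to a factor index, clamped to
 * the last bucket, like the min_t() expression in the patch.
 */
static inline int exp_bucket_index(uint64_t cur_win_nsec)
{
	uint64_t idx = cur_win_nsec / EXP_BUCKET_SIZE;

	return idx < NR_EXP_FACTORS - 1 ? (int)idx : NR_EXP_FACTORS - 1;
}
```

With these definitions, a 100 ms window maps to index 0 (the slowest decay) and
a 1 s window maps to index 4, so
ewma_update(avg, exp_factors[exp_bucket_index(win)], mean) plays the role of
the CALC_LOAD() call in iolatency_check_latencies(). A steady input leaves the
average unchanged, which is a quick sanity check on the scaling.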