Received: by 2002:ac0:a5a6:0:0:0:0:0 with SMTP id m35-v6csp152630imm; Thu, 30 Aug 2018 18:57:21 -0700 (PDT) X-Google-Smtp-Source: ANB0VdYkOjArWeQbMCGXrgXdgJL10rhNPs5SB5PJt80Zi4+pYKOMI8Bwsm4ZjXMqAMkqP8gvBW/x X-Received: by 2002:a17:902:561:: with SMTP id 88-v6mr12798012plf.320.1535680641111; Thu, 30 Aug 2018 18:57:21 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1535680641; cv=none; d=google.com; s=arc-20160816; b=L/3iN5puMZ7nzkmC31hXXfwQsTsuUTLqMwhaO32unw2xer1kOTPnurp6HjxlewMRo/ B7JP8xba6zZxvsxI+5Is8r6yPMRhLHdazADUuzoC3L99pQSRG//XAmsLWutrpqTW+ops X13D7s9KSfaaYW/jg716MhaW9/jW9apMMbOk4JPIwRQCA6rSF5+lXbwL9qpgXAcJDMp0 2bJ77kspTuL0kePLaj7yV/l9RN/SZZISCMsQKeQnI3APC3IbFMzoSqUPWyUcasL/CjIa dfb0haRlSM0HOK1pzV7ORJJrlRnq6hTcwRIEcouxhLXWiWNp3JjiLcid8/QhvNA549Ja 3r4w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:references:in-reply-to:message-id:date :subject:cc:to:from:dkim-signature:arc-authentication-results; bh=TO0jmspUoRjVVfIoVYshEwAx1ch9EaHAEw+E1Q58OTI=; b=IiBuATse8BPAUkaHvwcLHxWkhy+RqAUbkuwjzsshmylcbEY+FbX4QE0j6Ki6Sr44y3 HQCi+MzXrjLae8RxLXKOtdh33aDoPyx0rO7alylWWX3C2iIRO2vNlElknAiZu9GKiWhM +pE7HFbZc7U+5dd/Cj6zZExENr/DjGsJO7wpYc2L7d8UZ4H/zbkjS0T+GMzGh3HRFnS3 yOGCsFxflf63VDmzFUYHuYSFtguwNZimN+6CCZSxX+npWUGiNsl5J13DLvFEi6UTX2/7 cUZlSXCnGaG9bopvPYFdA9YvUbK0xZdgD7EI0fPucPjKelm5VxnkCX4CxRF8LbKvJuog 3SXA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=Shz0Q5Qi; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id b69-v6si8292406pga.90.2018.08.30.18.57.06; Thu, 30 Aug 2018 18:57:21 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=Shz0Q5Qi; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727423AbeHaF7U (ORCPT + 99 others); Fri, 31 Aug 2018 01:59:20 -0400 Received: from mail-yb1-f195.google.com ([209.85.219.195]:33512 "EHLO mail-yb1-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726727AbeHaF7T (ORCPT ); Fri, 31 Aug 2018 01:59:19 -0400 Received: by mail-yb1-f195.google.com with SMTP id d4-v6so330212ybl.0; Thu, 30 Aug 2018 18:54:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=TO0jmspUoRjVVfIoVYshEwAx1ch9EaHAEw+E1Q58OTI=; b=Shz0Q5Qi2MPI43f81Ff0S8Vkr8KQ+4rzAUBtHB7nXItV+cY2rT5ow5cPeZxOPNpzKZ OuXRIllcG3eSzNOluaqxO7dBoqyqs/LRscV2w5mj3dvPA34XPvmmEoCqDoCDrqZNxyKp jvXasFYWbhH8CcoRlr37KkC2FBUyyo8SdQsLkgDnc36f4XfUajFWumnRoUI+pSQjqCn0 4KYIX4910H88Pr9GSLX1mWgYL7Gu/juxAwXwjZl0j3sURsuLp0sCW1F3Ave/JLTmT4Bo UEdX0Gb+PYDUGGLx0l/aZNoTrqbnJ87I1DNwjY9jjkyLZQoW6JN+BCuwcnIsxwhJUDDT KhxQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=TO0jmspUoRjVVfIoVYshEwAx1ch9EaHAEw+E1Q58OTI=; b=YoQnod3gzv79e+vL7318fK48Cng60d+i8xgvB+kDKj0edmdFLPePQxuIyfmsiUMjOi hS+F5sdtWS6HU/1dvmOQ8+ws8m1Anj5ssEsdgxqHqu6DjBHp9LQEaZTVGOi8Zxs5rH++ l4XAps1AQ1dO2VICE2QvN+DhARnwBbxu19npTS5LyuOXFZoJMg3vHUbvU22o9XuQW0nH wG+VRqqxbHagWNwV67t/tTjOfImuI1aNAnB88X+aphSTE6PVjV7fekhOI/NMZCRZJhAG vqLZygV8NupbcouMN3x+2ob5hrqgO7rkEvj4S3jSQI1QWOnntkN1imSt12R8TnjHIhlZ Eidg== X-Gm-Message-State: APzg51AHCrj6lU9/s8AVAMKbcFbC8Jmpn0V4gTyi8w9FHkjvb/rCl2xr EQdmsUSvTsb2YaoCvcT0eYEIfv0qe8I= X-Received: by 2002:a25:ad08:: with SMTP id y8-v6mr7435993ybi.253.1535680456533; Thu, 30 Aug 2018 18:54:16 -0700 (PDT) Received: from dennisz-mbp.thefacebook.com ([199.201.65.129]) by smtp.gmail.com with ESMTPSA id j70-v6sm3274084ywb.69.2018.08.30.18.54.15 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 30 Aug 2018 18:54:16 -0700 (PDT) From: Dennis Zhou To: Jens Axboe , Tejun Heo , Johannes Weiner , Josef Bacik Cc: kernel-team@fb.com, linux-block@vger.kernel.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, "Dennis Zhou (Facebook)" , Jiufei Xue , Joseph Qi Subject: [PATCH 01/15] Revert "blk-throttle: fix race between blkcg_bio_issue_check() and cgroup_rmdir()" Date: Thu, 30 Aug 2018 21:53:42 -0400 Message-Id: <20180831015356.69796-2-dennisszhou@gmail.com> X-Mailer: git-send-email 2.13.5 In-Reply-To: <20180831015356.69796-1-dennisszhou@gmail.com> References: <20180831015356.69796-1-dennisszhou@gmail.com> Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: "Dennis Zhou (Facebook)" This reverts commit 4c6994806f708559c2812b73501406e21ae5dcd0. Destroying blkgs is tricky because of the nature of the relationship. A blkg should go away when either a blkcg or a request_queue goes away. However, blkg's pin the blkcg to ensure they remain valid. To break this cycle, when a blkcg is offlined, blkgs put back their css ref. This eventually lets css_free() get called which frees the blkcg. The above commit (4c6994806f70) breaks this order of events by trying to destroy blkgs in css_free(). As the blkgs still hold references to the blkcg, css_free() is never called. The race between blkcg_bio_issue_check() and cgroup_rmdir() will be addressed in the following patch by delaying destruction of a blkg until all writeback associated with the blkcg has been finished. Fixes: 4c6994806f70 ("blk-throttle: fix race between blkcg_bio_issue_check() and cgroup_rmdir()") Signed-off-by: Dennis Zhou Cc: Jiufei Xue Cc: Joseph Qi Cc: Tejun Heo Cc: Jens Axboe --- block/blk-cgroup.c | 78 ++++++++------------------------------ include/linux/blk-cgroup.h | 1 - 2 files changed, 16 insertions(+), 63 deletions(-) diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c index 694595b29b8f..2998e4f095d1 100644 --- a/block/blk-cgroup.c +++ b/block/blk-cgroup.c @@ -310,28 +310,11 @@ struct blkcg_gq *blkg_lookup_create(struct blkcg *blkcg, } } -static void blkg_pd_offline(struct blkcg_gq *blkg) -{ - int i; - - lockdep_assert_held(blkg->q->queue_lock); - lockdep_assert_held(&blkg->blkcg->lock); - - for (i = 0; i < BLKCG_MAX_POLS; i++) { - struct blkcg_policy *pol = blkcg_policy[i]; - - if (blkg->pd[i] && !blkg->pd[i]->offline && - pol->pd_offline_fn) { - pol->pd_offline_fn(blkg->pd[i]); - blkg->pd[i]->offline = true; - } - } -} - static void blkg_destroy(struct blkcg_gq *blkg) { struct blkcg *blkcg = blkg->blkcg; struct blkcg_gq *parent = blkg->parent; + int i; lockdep_assert_held(blkg->q->queue_lock); lockdep_assert_held(&blkcg->lock); @@ -340,6 +323,13 @@ static void blkg_destroy(struct blkcg_gq *blkg) WARN_ON_ONCE(list_empty(&blkg->q_node)); WARN_ON_ONCE(hlist_unhashed(&blkg->blkcg_node)); + for (i = 0; i < BLKCG_MAX_POLS; i++) { + struct blkcg_policy *pol = blkcg_policy[i]; + + if (blkg->pd[i] && pol->pd_offline_fn) + pol->pd_offline_fn(blkg->pd[i]); + } + if (parent) { blkg_rwstat_add_aux(&parent->stat_bytes, &blkg->stat_bytes); blkg_rwstat_add_aux(&parent->stat_ios, &blkg->stat_ios); @@ -382,7 +372,6 @@ static void blkg_destroy_all(struct request_queue *q) struct blkcg *blkcg = blkg->blkcg; spin_lock(&blkcg->lock); - blkg_pd_offline(blkg); blkg_destroy(blkg); spin_unlock(&blkcg->lock); } @@ -1058,54 +1047,21 @@ static struct cftype blkcg_legacy_files[] = { * @css: css of interest * * This function is called when @css is about to go away and responsible - * for offlining all blkgs pd and killing all wbs associated with @css. - * blkgs pd offline should be done while holding both q and blkcg locks. - * As blkcg lock is nested inside q lock, this function performs reverse - * double lock dancing. + * for shooting down all blkgs associated with @css. blkgs should be + * removed while holding both q and blkcg locks. As blkcg lock is nested + * inside q lock, this function performs reverse double lock dancing. * * This is the blkcg counterpart of ioc_release_fn(). */ static void blkcg_css_offline(struct cgroup_subsys_state *css) { struct blkcg *blkcg = css_to_blkcg(css); - struct blkcg_gq *blkg; spin_lock_irq(&blkcg->lock); - hlist_for_each_entry(blkg, &blkcg->blkg_list, blkcg_node) { - struct request_queue *q = blkg->q; - - if (spin_trylock(q->queue_lock)) { - blkg_pd_offline(blkg); - spin_unlock(q->queue_lock); - } else { - spin_unlock_irq(&blkcg->lock); - cpu_relax(); - spin_lock_irq(&blkcg->lock); - } - } - - spin_unlock_irq(&blkcg->lock); - - wb_blkcg_offline(blkcg); -} - -/** - * blkcg_destroy_all_blkgs - destroy all blkgs associated with a blkcg - * @blkcg: blkcg of interest - * - * This function is called when blkcg css is about to free and responsible for - * destroying all blkgs associated with @blkcg. - * blkgs should be removed while holding both q and blkcg locks. As blkcg lock - * is nested inside q lock, this function performs reverse double lock dancing. - */ -static void blkcg_destroy_all_blkgs(struct blkcg *blkcg) -{ - spin_lock_irq(&blkcg->lock); while (!hlist_empty(&blkcg->blkg_list)) { struct blkcg_gq *blkg = hlist_entry(blkcg->blkg_list.first, - struct blkcg_gq, - blkcg_node); + struct blkcg_gq, blkcg_node); struct request_queue *q = blkg->q; if (spin_trylock(q->queue_lock)) { @@ -1117,7 +1073,10 @@ static void blkcg_destroy_all_blkgs(struct blkcg *blkcg) spin_lock_irq(&blkcg->lock); } } + spin_unlock_irq(&blkcg->lock); + + wb_blkcg_offline(blkcg); } static void blkcg_css_free(struct cgroup_subsys_state *css) @@ -1125,8 +1084,6 @@ static void blkcg_css_free(struct cgroup_subsys_state *css) struct blkcg *blkcg = css_to_blkcg(css); int i; - blkcg_destroy_all_blkgs(blkcg); - mutex_lock(&blkcg_pol_mutex); list_del(&blkcg->all_blkcgs_node); @@ -1480,11 +1437,8 @@ void blkcg_deactivate_policy(struct request_queue *q, list_for_each_entry(blkg, &q->blkg_list, q_node) { if (blkg->pd[pol->plid]) { - if (!blkg->pd[pol->plid]->offline && - pol->pd_offline_fn) { + if (pol->pd_offline_fn) pol->pd_offline_fn(blkg->pd[pol->plid]); - blkg->pd[pol->plid]->offline = true; - } pol->pd_free_fn(blkg->pd[pol->plid]); blkg->pd[pol->plid] = NULL; } diff --git a/include/linux/blk-cgroup.h b/include/linux/blk-cgroup.h index 34aec30e06c7..1615cdd4c797 100644 --- a/include/linux/blk-cgroup.h +++ b/include/linux/blk-cgroup.h @@ -89,7 +89,6 @@ struct blkg_policy_data { /* the blkg and policy id this per-policy data belongs to */ struct blkcg_gq *blkg; int plid; - bool offline; }; /* -- 2.17.1