Date: Mon, 27 Jul 2020 11:17:35 +0200
From: Oleksandr Natalenko <oleksandr@natalenko.name>
To: Dmitry Monakhov
Cc: linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, axboe@kernel.dk, paolo.valente@linaro.org
Subject: Re: [PATCH] block: bfq fix blkio cgroup leakage v3
In-Reply-To: <20200727080107.6431-1-dmtrmonakhov@yandex-team.ru>
References: <6422992afade0015d817a65c124e0c75@natalenko.name> <20200727080107.6431-1-dmtrmonakhov@yandex-team.ru>
Message-ID: <1bfaa79dd9d336b0943c863f918737b2@natalenko.name>

On 27.07.2020 10:01, Dmitry Monakhov wrote:
> commit db37a34c563b ("block, bfq: get a ref to a group when adding it
> to a service tree") introduces a leak of bfq_group and blkcg_gq
> objects because of a get/put imbalance.
> See trace below:
> -> blkg_alloc
>    -> bfq_pq_alloc
>       -> bfqg_get (+1)
> -> bfq_activate_bfqq
>    -> bfq_activate_requeue_entity
>       -> __bfq_activate_entity
>          -> bfq_get_entity
>             -> bfqg_and_blkg_get (+1)  <==== : Note1
> -> bfq_del_bfqq_busy
>    -> bfq_deactivate_entity+0x53/0xc0 [bfq]
>       -> __bfq_deactivate_entity+0x1b8/0x210 [bfq]
>          -> bfq_forget_entity(is_in_service = true)
>             entity->on_st_or_in_serv = false  <=== :Note2
>             if (is_in_service)
>                 return;  ==> do not touch reference
> -> blkcg_css_offline
>    -> blkcg_destroy_blkgs
>       -> blkg_destroy
>          -> bfq_pd_offline
>             -> __bfq_deactivate_entity
>                if (!entity->on_st_or_in_serv) /* true, because (Note2) */
>                    return false;
>          -> bfq_pd_free
>             -> bfqg_put() (-1, but bfqg->ref == 2) because of (Note2)
> So bfq_group and blkcg_gq will leak forever, see the test case below.
>
> We should drop the group reference once it is finally removed from
> service inside __bfq_bfqd_reset_in_service, as we do with queue
> entities.
>
> ##TESTCASE_BEGIN:
> #!/bin/bash
>
> max_iters=${1:-100}
> #prep cgroup mounts
> mount -t tmpfs cgroup_root /sys/fs/cgroup
> mkdir /sys/fs/cgroup/blkio
> mount -t cgroup -o blkio none /sys/fs/cgroup/blkio
>
> # Prepare blkdev
> grep blkio /proc/cgroups
> truncate -s 1M img
> losetup /dev/loop0 img
> echo bfq > /sys/block/loop0/queue/scheduler
>
> grep blkio /proc/cgroups
> for ((i=0;i<max_iters;i++))
> do
>     mkdir -p /sys/fs/cgroup/blkio/a
>     echo 0 > /sys/fs/cgroup/blkio/a/cgroup.procs
>     dd if=/dev/loop0 bs=4k count=1 of=/dev/null iflag=direct 2> /dev/null
>     echo 0 > /sys/fs/cgroup/blkio/cgroup.procs
>     rmdir /sys/fs/cgroup/blkio/a
>     grep blkio /proc/cgroups
> done
> ##TESTCASE_END:
>
> changes since v2:
>  - use safe iteration macro to prevent freed object dereference.
>
> Signed-off-by: Dmitry Monakhov
> ---
>  block/bfq-wf2q.c | 24 ++++++++++++++----------
>  1 file changed, 14 insertions(+), 10 deletions(-)
>
> diff --git a/block/bfq-wf2q.c b/block/bfq-wf2q.c
> index 8113138..13b417a 100644
> --- a/block/bfq-wf2q.c
> +++ b/block/bfq-wf2q.c
> @@ -635,14 +635,10 @@ static void bfq_idle_insert(struct bfq_service_tree *st,
>   * @entity: the entity being removed.
>   * @is_in_service: true if entity is currently the in-service entity.
>   *
> - * Forget everything about @entity. In addition, if entity represents
> - * a queue, and the latter is not in service, then release the service
> - * reference to the queue (the one taken through bfq_get_entity). In
> - * fact, in this case, there is really no more service reference to
> - * the queue, as the latter is also outside any service tree. If,
> - * instead, the queue is in service, then __bfq_bfqd_reset_in_service
> - * will take care of putting the reference when the queue finally
> - * stops being served.
> + * Forget everything about @entity. If entity is not in service, then release
> + * the service reference to the entity (the one taken through bfq_get_entity).
> + * If the entity is in service, then __bfq_bfqd_reset_in_service will take care
> + * of putting the reference when the entity finally stops being served.
>   */
>  static void bfq_forget_entity(struct bfq_service_tree *st,
>  			      struct bfq_entity *entity,
> @@ -1615,6 +1611,7 @@ bool __bfq_bfqd_reset_in_service(struct bfq_data *bfqd)
>  	struct bfq_queue *in_serv_bfqq = bfqd->in_service_queue;
>  	struct bfq_entity *in_serv_entity = &in_serv_bfqq->entity;
>  	struct bfq_entity *entity = in_serv_entity;
> +	struct bfq_entity *parent = NULL;
>
>  	bfq_clear_bfqq_wait_request(in_serv_bfqq);
>  	hrtimer_try_to_cancel(&bfqd->idle_slice_timer);
> @@ -1626,9 +1623,16 @@ bool __bfq_bfqd_reset_in_service(struct bfq_data *bfqd)
>  	 * execute the final step: reset in_service_entity along the
>  	 * path from entity to the root.
>  	 */
> -	for_each_entity(entity)
> +	for_each_entity_safe(entity, parent) {
>  		entity->sched_data->in_service_entity = NULL;
> -
> +		/*
> +		 * Release the bfq_group's reference if it was not released
> +		 * in bfq_forget_entity; it was taken in bfq_get_entity.
> +		 */
> +		if (!bfq_entity_to_bfqq(entity) && !entity->on_st_or_in_serv)
> +			bfqg_and_blkg_put(container_of(entity, struct bfq_group,
> +						       entity));
> +	}
>  	/*
>  	 * in_serv_entity is no longer in service, so, if it is in no
>  	 * service tree either, then release the service reference to

Reportedly, this one crashes too [1], and this happens even earlier than
with v2.

[1] http://pix.academ.info/images/img/2020/07/27/91f656514707728730b0b67f8c9f4a04.jpg

-- 
Oleksandr Natalenko (post-factum)
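Regarding the "changes since v2" note in the quoted patch (switching to a safe iteration macro): the new loop body may drop the last reference to the bfq_group that owns the entity currently being visited, so the parent pointer must be read before the put, otherwise the loop update dereferences freed memory. Below is a minimal userspace sketch of that generic pattern; struct node, node_put() and the walk functions are made-up stand-ins for illustration, not BFQ code.

/*
 * Illustrative sketch of "safe iteration": why the next pointer must be
 * saved before the loop body may free the current node.
 * All names here are hypothetical; error checking is omitted for brevity.
 */
#include <stdlib.h>

struct node {
	struct node *parent;
	int refcount;
};

/* Drop one reference; free the node when the last reference is gone. */
void node_put(struct node *n)
{
	if (--n->refcount == 0)
		free(n);
}

/*
 * UNSAFE walk, analogous to plain for_each_entity() in v2: the loop
 * update reads n->parent after node_put() may already have freed n.
 */
void walk_unsafe(struct node *n)
{
	for (; n; n = n->parent)
		node_put(n);
}

/*
 * SAFE walk, analogous to for_each_entity_safe() in v3: the parent
 * pointer is saved before the current node can be freed.
 */
void walk_safe(struct node *n)
{
	struct node *parent;

	for (; n; n = parent) {
		parent = n->parent;
		node_put(n);
	}
}

int main(void)
{
	struct node *parent = calloc(1, sizeof(*parent));
	struct node *child = calloc(1, sizeof(*child));

	parent->refcount = 2;	/* one ref held by child, one by main() */
	child->refcount = 1;
	child->parent = parent;

	walk_safe(child);	/* frees child, drops one ref on parent */
	node_put(parent);	/* drops main()'s ref, frees parent */
	return 0;
}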