Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755278Ab0DEPMv (ORCPT ); Mon, 5 Apr 2010 11:12:51 -0400
Received: from mx1.redhat.com ([209.132.183.28]:63008 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1755192Ab0DEPMp (ORCPT ); Mon, 5 Apr 2010 11:12:45 -0400
Date: Mon, 5 Apr 2010 11:12:38 -0400
From: Vivek Goyal
To: Divyesh Shah
Cc: jens.axboe@oracle.com, linux-kernel@vger.kernel.org, nauman@google.com,
	ctalbott@google.com
Subject: Re: [PATCH 3/3] blkio: Increment the blkio cgroup stats for real now
Message-ID: <20100405151237.GD876@redhat.com>
References: <20100401215541.2843.79107.stgit@austin.mtv.corp.google.com>
	<20100401220129.2843.36193.stgit@austin.mtv.corp.google.com>
	<20100402191000.GD3516@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To:
User-Agent: Mutt/1.5.19 (2009-01-05)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 4365
Lines: 105

On Fri, Apr 02, 2010 at 04:36:34PM -0700, Divyesh Shah wrote:
> On Fri, Apr 2, 2010 at 12:10 PM, Vivek Goyal wrote:
> > On Thu, Apr 01, 2010 at 03:01:41PM -0700, Divyesh Shah wrote:
> >> We also add start_time_ns and io_start_time_ns fields to struct request
> >> here to record the time when a request is created and when it is
> >> dispatched to device. We use ns uints here as ms and jiffies are
> >> not very useful for non-rotational media.
> >>
> >> Signed-off-by: Divyesh Shah
> >> ---
> >>
> >>  block/blk-cgroup.c     |   60 ++++++++++++++++++++++++++++++++++++++++++++++--
> >>  block/blk-cgroup.h     |   14 +++++++++--
> >>  block/blk-core.c       |    6 +++--
> >>  block/cfq-iosched.c    |    4 ++-
> >>  include/linux/blkdev.h |   20 +++++++++++++++-
> >>  5 files changed, 95 insertions(+), 9 deletions(-)
> >>
> >> diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
> >> index ad6843f..9af7257 100644
> >> --- a/block/blk-cgroup.c
> >> +++ b/block/blk-cgroup.c
> >> @@ -15,6 +15,7 @@
> >>  #include
> >>  #include
> >>  #include
> >> +#include
> >>  #include "blk-cgroup.h"
> >>
> >>  static DEFINE_SPINLOCK(blkio_list_lock);
> >> @@ -55,6 +56,26 @@ struct blkio_cgroup *cgroup_to_blkio_cgroup(struct cgroup *cgroup)
> >>  }
> >>  EXPORT_SYMBOL_GPL(cgroup_to_blkio_cgroup);
> >>
> >> +/*
> >> + * Add to the appropriate stat variable depending on the request type.
> >> + * This should be called with the blkg->stats_lock held.
> >> + */
> >> +void io_add_stat(uint64_t *stat, uint64_t add, unsigned int flags)
> >> +{
> >> +     if (flags & REQ_RW)
> >> +             stat[IO_WRITE] += add;
> >> +     else
> >> +             stat[IO_READ] += add;
> >> +     /*
> >> +      * Everywhere in the block layer, an IO is treated as sync if it is a
> >> +      * read or a SYNC write. We follow the same norm.
> >> +      */
> >> +     if (!(flags & REQ_RW) || flags & REQ_RW_SYNC)
> >> +             stat[IO_SYNC] += add;
> >> +     else
> >> +             stat[IO_ASYNC] += add;
> >> +}
> >> +
> >
> > Hi Divyesh,
> >
> > Can we have any request based information limited to cfq and not put that
> > in blkio-cgroup. The reason being that I am expecting that some kind of
> > max bw policy interface will not necessarily be implemented at CFQ
> > level. We might have to implement it at higher level so that it can
> > work with all dm/md devices. If that's the case, then it might very well
> > be either a bio based interface also.
> >
> > So just keeping that possibility in mind, can we keep blk-cgroup as
> > generic as possible and not necessarily make it dependent on "struct
> > request".
>
> Ok. I do understand the motivation for keeping the request related
> info out of blk-cgroup.
> Everything except the rq->cmd_flags can be
> easily done away with. Maybe I'll need to have CFQ send the sync and
> direction bits as args to the functions that need it. Not ideal coz
> we'll have functions with many args but I guess its not that bad too.
>
> >
> > If you implement, two dimensional arrays for stats then we can have
> > following function.
> >
> > blkio_add_stat(enum stat_type var enum stat_sub_type var_type, u64 val)
>
> I would want to avoid calls like these from CFQ into the blkcg code
> because many CFQ events trigger update for multiple stats (you'll see
> more with stats in later patchsets) and doing these calls
> independently for each stat would mean that we would also need to grab
> the stats_lock multiple times when we could've avoided that.

I understand the need to batch the updates and to avoid taking
stats_lock multiple times. I was thinking of one of the following:

- Get rid of the per-cgroup reset interface. Rely on changing the
  ioscheduler on the request queue instead; that would get rid of
  stats_lock entirely.

- Can we use a function blkio_add_stat() with a variable number of
  arguments, so that more than one stat can be updated in a single call?

If you have other ideas for implementing this without assuming
"struct request" in blk-cgroup, please go ahead with those.

Thanks
Vivek
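
To make the second option a bit more concrete, here is a rough userspace
sketch of what a variadic blkio_add_stat() over a two-dimensional stats
array could look like. Every name in it (struct blkio_stats, the enum
values, the lock_taken counter) is made up for illustration and is not
taken from the patch. The counter only stands in for blkg->stats_lock so
that one can see the lock would be taken once per call rather than once
per stat. Nothing here depends on struct request; the caller passes the
direction and sync classification explicitly.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdint.h>

/* Hypothetical stat dimensions, for illustration only. */
enum stat_type { STAT_SERVICE_BYTES, STAT_SERVICED, STAT_NR_TYPES };
enum stat_sub_type { SUB_READ, SUB_WRITE, SUB_SYNC, SUB_ASYNC, SUB_NR_TYPES };

/* Hypothetical container; a real one would hold a spinlock. */
struct blkio_stats {
	unsigned int lock_taken;	/* counts simulated lock acquisitions */
	uint64_t stat[STAT_NR_TYPES][SUB_NR_TYPES];
};

/*
 * Apply nr updates in one call. The variable arguments are nr triples of
 * (enum stat_type, enum stat_sub_type, uint64_t value). The value must
 * really be a uint64_t at the call site, since varargs do no conversion.
 */
void blkio_add_stat(struct blkio_stats *s, unsigned int nr, ...)
{
	va_list ap;

	va_start(ap, nr);
	s->lock_taken++;		/* stands in for taking stats_lock once */
	while (nr--) {
		int type = va_arg(ap, int);
		int sub = va_arg(ap, int);
		uint64_t val = va_arg(ap, uint64_t);

		assert(type >= 0 && type < STAT_NR_TYPES);
		assert(sub >= 0 && sub < SUB_NR_TYPES);
		s->stat[type][sub] += val;
	}
	/* the simulated lock would be dropped here */
	va_end(ap);
}
```

A caller such as CFQ could then account a completed 4096-byte sync write
with a single call that passes four triples (bytes and count, each under
both the WRITE and SYNC sub-types). The obvious cost of the variadic form
is that the compiler cannot check the argument list, so an array of
(type, sub_type, value) structs might turn out to be the safer interface.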