LinuxLists.cc - [RFC] Block IO Controller V2

2009-11-12 23:43:59

Subject: [RFC] Block IO Controller V2

Hi Jens,

This is V2 of the Block IO controller patches on top of the "for-2.6.33"
branch of the block tree.

A consolidated patch can be found here:

http://people.redhat.com/vgoyal/io-controller/blkio-controller/blkio-controller-v2.patch

Changes from V1:

- Rebased the patches onto the "for-2.6.33" branch.
- Dropped support for group priority classes for now. For the time being,
only BE class groups are supported.

After the discussions at the IO minisummit in Tokyo, Japan, it was agreed that
a single IO control policy, whether at leaf nodes or at higher level nodes,
does not meet all the requirements. We need the capability to support more
than one IO control policy (like proportional weight division and max
bandwidth control) and to implement some of these policies at higher level
logical devices.

It was agreed that CFQ is the right place to implement a time-based
proportional weight division policy. Other policies, like max bandwidth
control/throttling, make more sense at higher level logical devices.

This patchset introduces the blkio cgroup controller. It provides the
management interface for block IO control. The idea is to keep the interface
common, so that in the background we can switch policies based on user
options. Hence a user can control IO throughout the IO stack with a single
cgroup interface.
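As a rough illustration of what a single cgroup interface looks like from user space, the sketch below sets a per-group weight by writing an integer to a control file inside the group's directory. The file name `blkio.weight` and the helper `set_blkio_weight` are assumptions made for illustration; Documentation/cgroups/blkio-controller.txt in the patchset is the authority on the actual file names and accepted values.

```python
import os
import tempfile

def set_blkio_weight(cgroup_dir, weight, filename="blkio.weight"):
    """Write an integer weight to a cgroup control file.

    The default file name is an assumption, not taken from this posting.
    """
    path = os.path.join(cgroup_dir, filename)
    with open(path, "w") as f:
        f.write(f"{int(weight)}\n")
    return path

if __name__ == "__main__":
    # Use a scratch directory here instead of a real cgroup mount point.
    with tempfile.TemporaryDirectory() as scratch:
        written = set_blkio_weight(scratch, 500)
        with open(written) as f:
            print(f.read().strip())
```

On a real system the directory would be a group created under the mounted blkio cgroup hierarchy, and the write would need appropriate privileges.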

Apart from the blkio cgroup interface, this patchset also modifies CFQ to
implement time-based proportional weight division of the disk. CFQ already
does this in flat mode. It has been modified to do group IO scheduling as well.
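To make time-based proportional weight division concrete, here is a minimal simulation sketch, not the CFQ code. It assumes a simple virtual-time scheme: each group accumulates virtual disk time at a rate inversely proportional to its weight, and the scheduler always serves the group with the smallest virtual time. The names `simulate`, `slice_ms`, and `base_weight` are hypothetical.

```python
def simulate(weights, slice_ms=100, rounds=300, base_weight=500):
    """Return disk time (ms) received per group under weighted virtual time.

    weights: dict mapping group name -> positive weight.
    This is an illustrative model, not the scheduler's implementation.
    """
    vtime = {g: 0.0 for g in weights}  # virtual time charged so far
    used = {g: 0 for g in weights}     # real disk time received so far
    for _ in range(rounds):
        # Serve the group that is furthest behind in virtual time
        # (ties broken by group name to keep the run deterministic).
        g = min(vtime, key=lambda k: (vtime[k], k))
        used[g] += slice_ms
        # Charge inversely to weight: heavier groups age more slowly,
        # so they get picked more often.
        vtime[g] += slice_ms * base_weight / weights[g]
    return used

if __name__ == "__main__":
    print(simulate({"a": 1000, "b": 500}))
```

In this model a group with twice the weight ends up with roughly twice the disk time, as long as both groups stay continuously backlogged.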

IO control is a huge problem, and the moment we start addressing all the
issues in one patchset, it bloats to unmanageable proportions and then nothing
gets into the kernel. So at the IO minisummit we agreed to take small steps:
once a piece of code is inside the kernel and stabilized, take the next step.
This is the first step.

Some parts of the code are based on BFQ patches posted by Paolo and Fabio.

Your feedback is welcome.

TODO
====
- Support async IO control (buffered writes).

Buffered writes are a beast: solving the problem requires changes in many
places, and the patchset becomes huge. Hence we plan to first support control
of sync IO only, and then work on async IO too.

Some of the work items identified are:

- Per memory cgroup dirty ratio
- Possible modification of writeback to force writeback from a
particular cgroup.
- Implement IO tracking support so that a bio can be mapped to a cgroup.
- Per group request descriptor infrastructure in block layer.
- At CFQ level, implement per cfq_group async queues.

In this patchset, all async IO goes into system-wide queues and there are
no per-group async queues. That means we will see service differentiation
for sync IO only. Async IO will be handled later.

- Support for higher level policies like max BW controller.
- Support groups of RT class also.
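The sync/async split described under the first TODO item can be pictured with a small sketch: sync requests land in a queue owned by their group, while async requests from every group share one system-wide queue, so per-group weights cannot separate them. The class and method names are hypothetical and this is not the CFQ data structure.

```python
from collections import defaultdict, deque

class QueueRouter:
    """Illustrative model: per-group sync queues, one shared async queue."""

    def __init__(self):
        self.sync_queues = defaultdict(deque)  # one queue per group
        self.async_queue = deque()             # shared by all groups

    def enqueue(self, group, request, is_sync):
        if is_sync:
            # Sync IO is queued per group, so it can be scheduled by weight.
            self.sync_queues[group].append(request)
        else:
            # Async IO from all groups is mixed in one system-wide queue.
            self.async_queue.append((group, request))

if __name__ == "__main__":
    router = QueueRouter()
    router.enqueue("g1", "read-1", is_sync=True)
    router.enqueue("g2", "write-1", is_sync=False)
    router.enqueue("g1", "write-2", is_sync=False)
    print(dict(router.sync_queues), list(router.async_queue))
```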

Thanks
Vivek

 Documentation/cgroups/blkio-controller.txt |  100 +++
 block/Kconfig                              |   22 +
 block/Kconfig.iosched                      |   17 +
 block/Makefile                             |    1 +
 block/blk-cgroup.c                         |  312 ++++++++++
 block/blk-cgroup.h                         |   90 +++
 block/cfq-iosched.c                        |  901 ++++++++++++++++++++++++----
 include/linux/cgroup_subsys.h              |    6 +
 include/linux/iocontext.h                  |    4 +
 9 files changed, 1346 insertions(+), 107 deletions(-)

2009-11-13 01:41:53

Subject: [RFC] Block IO Controller V2

Subject: [PATCH 01/16] blkio: Documentation

Subject: [PATCH 02/16] blkio: Introduce the notion of cfq groups

Subject: [PATCH 03/16] blkio: Keep queue on service tree until we expire it

Subject: [PATCH 04/16] blkio: Introduce the root service tree for cfq groups

Subject: [PATCH 05/16] blkio: Implement per cfq group latency target and busy queue avg

Subject: [PATCH 06/16] blkio: Introduce blkio controller cgroup interface

Subject: [PATCH 07/16] blkio: Introduce per cfq group weights and vdisktime calculations

Subject: [PATCH 08/16] blkio: Group time used accounting and workload context save restore

Subject: [PATCH 09/16] blkio: Dynamic cfq group creation based on cgroup tasks belongs to

Subject: [PATCH 10/16] blkio: Take care of cgroup deletion and cfq group reference counting

Subject: [PATCH 11/16] blkio: Some debugging aids for CFQ

Subject: [PATCH 12/16] blkio: Export disk time and sectors used by a group to user space

Subject: [PATCH 13/16] blkio: Provide some isolation between groups

Subject: [PATCH 14/16] blkio: Idle on a group for some time on rotational media

Subject: [PATCH 15/16] blkio: Drop the reference to queue once the task changes cgroup

Subject: [PATCH 16/16] blkio: Propagate cgroup weight updation to cfq groups

Subject: Re: [PATCH 03/16] blkio: Keep queue on service tree until we expire it

Subject: Re: [PATCH 05/16] blkio: Implement per cfq group latency target and busy queue avg

Subject: Re: [PATCH 01/16] blkio: Documentation

Subject: Re: [PATCH 14/16] blkio: Idle on a group for some time on rotational media
