LinuxLists.cc - [patch 0/5] RFC: fault-injection capabilities

2006-08-23 11:35:39

Subject: [patch 0/5] RFC: fault-injection capabilities

This patch set provides some fault-injection capabilities.

- kmalloc failures

- alloc_pages() failures

- disk IO errors

We can see what really happens if those failures happen.

In order to enable these fault-injection capabilities:

1. Enable relevant config options (CONFIG_FAILSLAB, CONFIG_PAGE_ALLOC,
CONFIG_MAKE_REQUEST) and runtime configuration kernel module
(CONFIG_SHOULD_FAIL_KNOBS)

2. build and boot with this kernel

3. modprobe should_fail_knob

4. configure fault-injection capabilities behavior by debugfs

For example about kmalloc failures:

/debug/failslab/probability

specifies how often it should fail in percent.

/debug/failslab/interval

specifies the interval of failures.

/debug/failslab/times

specifies how many times failures may happen at most.

/debug/failslab/space

specifies the size of free space where memory can be allocated
safely in bytes.

5. see what really happens.
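
A minimal userspace sketch of how the four knobs above might combine into one fail-or-succeed decision. This is an illustration and not the code from the patch set: the struct, its field names, the order of the checks, and the convention that a negative "times" means unlimited are all assumptions. The random draw is passed in by the caller so the logic stays deterministic.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the debugfs knobs; names are illustrative only. */
struct fail_attr {
    unsigned probability; /* failure chance in percent, 0..100 */
    unsigned interval;    /* only every Nth call is a candidate; 0 or 1 = all */
    int times;            /* failures left; negative = unlimited (assumption) */
    size_t space;         /* bytes that may still be allocated safely */
    unsigned long count;  /* candidate calls seen so far (internal state) */
};

/* roll is a caller-supplied random number in [0, 100). */
static bool should_fail(struct fail_attr *a, size_t size, unsigned roll)
{
    if (a->times == 0)              /* failure budget used up */
        return false;
    if (a->space >= size) {         /* still inside the safe space */
        a->space -= size;
        return false;
    }
    if (a->interval > 1) {          /* skip all but every Nth call */
        a->count++;
        if (a->count % a->interval != 0)
            return false;
    }
    if (roll >= a->probability)     /* probability check */
        return false;
    if (a->times > 0)               /* consume one allowed failure */
        a->times--;
    return true;
}
```

With probability=100, interval=2, times=1 and space=64, a 64-byte request is absorbed by the safe space, the next request is skipped by the interval, the one after that fails, and every later request succeeds because the single allowed failure has been spent.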

The idea is taken from failmalloc (http://www.nongnu.org/failmalloc/).
Andrew Morton gave me interesting suggestions.

--

2006-08-23 12:06:13

by Andi Kleen

Subject: Re: [patch 0/5] RFC: fault-injection capabilities

Akinobu Mita <[email protected]> writes:

> This patch set provides some fault-injection capabilities.
>
> - kmalloc failures
>
> - alloc_pages() failures
>
> - disk IO errors
>
> We can see what really happens if those failures happen.

Nice.

The SUSE kernel has a crasher module that is also quite useful for testing.
Basically, it continuously allocates and frees memory, overwrites it,
and checks that the memory hasn't been changed by someone else.
Perhaps something like that could be incorporated into your framework too?

I put a copy of the suse patch in
http://www.firstfloor.org/~andi/crasher-26.diff
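
The allocate, overwrite and check cycle described above can be sketched in userspace C roughly as follows. This is not taken from crasher-26.diff; the struct and the canary_* function names are hypothetical, and malloc/free stand in for whatever allocator the real module exercises.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record of one test allocation and the pattern written to it. */
struct canary_buf {
    unsigned char *mem;
    size_t len;
    unsigned char pattern;
};

/* Allocate len bytes and overwrite them with a known pattern. */
static bool canary_alloc(struct canary_buf *b, size_t len, unsigned char pattern)
{
    b->mem = malloc(len);
    if (!b->mem)
        return false;
    b->len = len;
    b->pattern = pattern;
    memset(b->mem, pattern, len);
    return true;
}

/* Return true if nobody has modified the buffer since it was filled. */
static bool canary_intact(const struct canary_buf *b)
{
    for (size_t i = 0; i < b->len; i++)
        if (b->mem[i] != b->pattern)
            return false;
    return true;
}

/* Release the buffer and clear the record. */
static void canary_free(struct canary_buf *b)
{
    free(b->mem);
    b->mem = NULL;
    b->len = 0;
}
```

A stress loop would call canary_alloc, wait or do other work, call canary_intact, and then canary_free, reporting any buffer whose pattern changed in between.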

>
> In order to enable these fault-injection capabilities:

However, I'm not sure they're too useful right now. The problem is
that they're too global and might render the system unusable. Have you
considered adding some more filters, like failing only for a given
uid/gid? I think that would be useful because then it would be possible
to run test suites with faults while keeping other parts of the system
functional. Or maybe even a list of callers to test? E.g. only
failing for module foo would be nice.

-Andi
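
A filter of the kind suggested here might look like the sketch below. It only illustrates the idea: the struct fail_filter type, its fields, and the notion of passing the caller's uid and module name in as plain arguments are all assumptions, and none of it is part of the posted patches.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical filter: restrict fault injection to one uid and/or module. */
struct fail_filter {
    bool match_uid;      /* if false, any uid is eligible */
    unsigned uid;        /* uid to match when match_uid is true */
    const char *module;  /* module name to match; NULL = any module */
};

/* Return true if a call from this uid/module is eligible for injection. */
static bool filter_allows(const struct fail_filter *f,
                          unsigned caller_uid, const char *caller_module)
{
    if (f->match_uid && f->uid != caller_uid)
        return false;
    if (f->module != NULL &&
        (caller_module == NULL || strcmp(f->module, caller_module) != 0))
        return false;
    return true;
}
```

The fault decision would then run only when filter_allows returns true, so a test suite running under a dedicated uid sees failures while the rest of the system does not.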

2006-08-23 14:19:08

by Alexey Dobriyan

Subject: Re: [patch 0/5] RFC: fault-injection capabilities

On Wed, Aug 23, 2006 at 08:32:43PM +0900, Akinobu Mita wrote:
> This patch set provides some fault-injection capabilities.
>
> - kmalloc failures
>
> - alloc_pages() failures
>
> - disk IO errors
>
> We can see what really happens if those failures happen.

What bugs has fault-injection already found? Ingo and Sons fixed quite
a few, _before_ lockdep was merged.

2006-08-24 18:42:02

by Valdis Klētnieks

Subject: Re: [patch 0/5] RFC: fault-injection capabilities

On Wed, 23 Aug 2006 20:32:43 +0900, Akinobu Mita said:

> For example about kmalloc failures:
>
> /debug/failslab/probability
>
> specifies how often it should fail in percent.

As others have noted, the *right* semantics for this is being able to inject a
1% or higher rate in the code you're interested in, while maintaining a 0
injection rate for things outside the module under test. Maybe a
/debug/failslab/address_start and address_end, and a userspace helper that
peeks at a System.map and injects the right values - then it's a simple
compare of the high/low addresses provided against the caller address.
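
The address compare proposed here could be as small as the following sketch. The struct addr_filter type and the caller_in_range function are hypothetical names, and treating the range as half-open (start included, end excluded) is an assumption; the message does not say which bound convention was intended.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical holder for the address_start/address_end knob values. */
struct addr_filter {
    uintptr_t start;  /* lowest address of the code under test */
    uintptr_t end;    /* first address past the code under test */
};

/* Return true if the caller address lies inside [start, end). */
static bool caller_in_range(const struct addr_filter *f, uintptr_t caller)
{
    return caller >= f->start && caller < f->end;
}
```

The userspace helper would look up the module's first and last symbols in System.map and write them to the two knobs; the allocator hook would then inject failures only when its caller's address passes this check.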
