2007-09-13 13:24:52

by Eric W. Biederman

Subject: My position on general ``RAS'' tool support infrastructure

Pete/Piet Delaney <[email protected]> writes:

> Jason, Eric:
>
> Did you read Keith Owens suggestion on RAS tools from:

Yes.

There is a tension here between generality of support infrastructure,
maintainability of the infrastructure, simplicity of the
infrastructure and reliability of the infrastructure.

The historical Linux perspective is that anything that compromises
the maintainability or the reliability of the kernel without the
tools is unacceptable.

There is also a historical perspective that using the single stepping
mode of a debugger to diagnose problems frequently leads to symptoms
being fixed and not the actual problems being fixed.

My initial proposal in this thread was that if kdb wanted to have
a hook point someplace where we are not comfortable adding a hook
point it could use a break point or some of the tracing
infrastructure. Somehow that suggestion seems to have gotten lost.

On the kexec on panic path the philosophy is that the kernel is
broken and as little as possible should be relied upon. So in general
I am opposed to extra code on that path, and to general hooks like
notifiers in particular, because they make adding non-paranoid code
much easier and review of the code on a particular call path much
harder.
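
As a rough illustration of the review problem with general hooks, here
is a minimal user-space sketch. It is not the kernel's notifier API;
register_panic_hook(), panic_path_with_hooks() and the two driver hooks
are hypothetical names made up for the example.

```c
/* Hypothetical user-space model of a hook list on a panic path. */
typedef void (*panic_hook_fn)(void);

#define MAX_HOOKS 8

panic_hook_fn hooks[MAX_HOOKS];
int nr_hooks;
int hooks_run;

/* Any code anywhere can add itself to the panic path. */
int register_panic_hook(panic_hook_fn fn)
{
    if (nr_hooks >= MAX_HOOKS)
        return -1;
    hooks[nr_hooks++] = fn;
    return 0;
}

/* Stand-ins for code that was not written with a broken kernel in mind. */
void driver_a_hook(void) { hooks_run++; }
void driver_b_hook(void) { hooks_run++; }

/*
 * The panic path itself: nothing in this function names the code that
 * will run. The set of callees is only known at run time.
 * Returns how many hooks were called.
 */
int panic_path_with_hooks(void)
{
    int i;

    hooks_run = 0;
    for (i = 0; i < nr_hooks; i++)
        hooks[i]();
    return hooks_run;
}
```

Reading panic_path_with_hooks() alone does not tell a reviewer what
will execute; that depends on every register_panic_hook() caller
elsewhere in the tree.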

From what I can tell the philosophy of the kdb code is that the kernel
is mostly ok except for one or two little bugs so it is reasonable to
rely on lots of kernel infrastructure.

As I understand the problem, the difference in philosophy and
maintenance overhead is why kexec on panic has been merged and why
it has a much larger success rate than previous crash dump
implementations like lkcd. I will note that in some sense it is a
harder approach to implement, as it emphasizes the challenge of
drivers that work starting from a random hardware state, and because
it draws a clear line between the broken kernel and the recovery
kernel. But those things are exactly what encourage things to work
well.

I don't mind playing well with others as long as that doesn't
compromise the reliability and maintainability of the implementation.

So far it is my opinion that the current kexec on panic implementation
is insufficiently paranoid and touches the hardware and the rest of
the kernel too much. Which explains my rather strong reactions when
people suggest that we trust the broken kernel more.

I don't think this is an insolvable problem but I do think it is a hard
problem that must be solved with delicacy.

I also get irritable that the last time something like this came up
I had to have a several-day-long conversation with someone about why
they needed a patch that had already been rejected because it
compromised the reliability of the implementation, only to discover
they were trying to make kdb and kexec on panic play nice together.

So if someone who is suggesting an implementation can absorb
and understand the requirements of the different groups and come
up with solutions that meet the requirements of the different projects
I think progress can be made. That as far as I know takes talent.

If we wind up with a situation where we have to continually review
unacceptable solutions, the choices are either to get negative about it
and reject everything, or to give up and let something through. Since
I think giving up in this situation is irresponsible and likely to
make a worse kernel I am leaning very strongly towards NAK'ing
everything because I have seen so many problematic proposals that did
not look like they were on the path to something reasonable.

Eric


2007-09-18 01:41:19

by Randy Dunlap

Subject: Re: My position on general ``RAS'' tool support infrastructure

On Thu, 13 Sep 2007 07:21:10 -0600 Eric W. Biederman wrote:

> Pete/Piet Delaney <[email protected]> writes:
>
> > Jason, Eric:
> >
> > Did you read Keith Owens suggestion on RAS tools from:


Yes. and I re-read it.

There are several things in Keith's email that make sense:

a. all RAS tools should use a common interface
b. it's not the kernel's job to decide which RAS tool runs first


Eric makes some good points too. I'm mostly similar to Eric:
paranoid about trusting software/hardware after a panic (or oops).

So if someone wants to use multiple RAS tools on a panic event,
enabling an admin to set priorities is OK with me, but I'll only
trust the first one that is used, and even that one may have
problems. IOW, I don't see a big need to support multiple RAS
tools at one time. (speaking for myself)


> So if someone who is suggesting an implementation can absorb
> and understand the requirements of the different groups and come
> up with solutions that meet the requirements of the different projects
> I think progress can be made. That as far as I know takes talent.

Ack that.

---
~Randy

2007-09-18 04:28:18

by Vivek Goyal

Subject: Re: My position on general ``RAS'' tool support infrastructure

On Mon, Sep 17, 2007 at 06:38:53PM -0700, Randy Dunlap wrote:
> On Thu, 13 Sep 2007 07:21:10 -0600 Eric W. Biederman wrote:
>
> > Pete/Piet Delaney <[email protected]> writes:
> >
> > > Jason, Eric:
> > >
> > > Did you read Keith Owens suggestion on RAS tools from:
>
>
> Yes. and I re-read it.
>
> There are several things in Keith's email that make sense:
>
> a. all RAS tools should use a common interface
> b. it's not the kernel's job to decide which RAS tool runs first
>
>
> Eric makes some good points too. I'm mostly similar to Eric:
> paranoid about trusting software/hardware after a panic (or oops).
>
> So if someone wants to use multiple RAS tools on a panic event,
> enabling an admin to set priorities is OK with me, but I'll only
> trust the first one that is used, and even that one may have
> problems. IOW, I don't see a big need to support multiple RAS
> tools at one time. (speaking for myself)
>

It would be nice to have a kernel debugger co-exist with crash dumping.

I like Eric's idea of the debugger putting a breakpoint on panic(). This
would mean that the rest of the post-panic() actions have to be performed
by the second kernel, which can perform those actions much more reliably.

But this also brings in the additional requirement of passing all the
required context to the second kernel. For example, in the past somebody wanted
to send a message to a remote node that the system had crashed so that a standby can
take over. If the same job has to be done in the second kernel, it requires all
the relevant information, like remote host IP, port etc., to be passed to the second
kernel, which I think makes the job a little harder. Maybe one can pre-configure
these parameters in user space and let the job be done either from the initrd
or from user space scripts in the second kernel.
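
One way the pre-configured parameters could be picked up in the second
kernel is sketched below. It assumes (this is not stated above) that
user space placed them on the second kernel's command line when loading
it, and that a small helper run from the initrd reads them back. The
key names notify_host and notify_port and the helper cmdline_get() are
hypothetical.

```c
#include <string.h>

/*
 * Copy the value of "key=value" from a whitespace-separated command
 * line into out. Returns 0 on success, -1 if the key is absent or the
 * value does not fit in outlen bytes (including the terminating NUL).
 */
int cmdline_get(const char *cmdline, const char *key,
                char *out, size_t outlen)
{
    size_t klen = strlen(key);
    const char *p = cmdline;

    while (*p) {
        size_t toklen;

        p += strspn(p, " \t\n");        /* skip separators */
        toklen = strcspn(p, " \t\n");   /* length of this token */

        if (toklen > klen && strncmp(p, key, klen) == 0 &&
            p[klen] == '=') {
            size_t vlen = toklen - klen - 1;

            if (vlen >= outlen)
                return -1;
            memcpy(out, p + klen + 1, vlen);
            out[vlen] = '\0';
            return 0;
        }
        p += toklen;
    }
    return -1;
}
```

A helper in the second kernel could read its command line (for example
from /proc/cmdline), call cmdline_get() for each key, and then send the
notification to the standby from user space.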

Thanks
Vivek