On Wed, 2006-08-30 at 16:40 -0400, Vivek Goyal wrote:
> On Wed, Aug 30, 2006 at 12:35:21PM -0700, Piet Delaney wrote:
> > >
> > > Simple question -- and to be quite honest with you -- I don't
> > > understand why you wouldn't want to simply use gdb alone
> > > in this case?
> >
> > I don't see any reason for core file not to be read correctly by
> > gdb. It's convenient to use gdb directly sometimes, for example
> > while using the ddd GUI.
> >
>
> You can run gdb to open core files as of today but the debugging
> capability will be limited. For ex. kernel core headers have the info
> of linearly mapped region only and they don't contain the virt address
> info of non-linearly mapped regions. So one can not debug the non-linearly
> mapped regions like modules.
Amit's modified gdb might help with that problem. I haven't used
it, but it allows gdb to load debug information about modules. You
can also use a script Amit wrote to explicitly load module info
into stock gdb; that might also work with kernel core files.
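
To picture the limitation Vivek describes: inside the linearly mapped region, a kernel virtual address is the physical address plus a fixed offset, so a debugger can translate it from the core headers alone. Module addresses sit outside that region and have no such fixed relationship. Below is a minimal sketch of that arithmetic, assuming an i386-style PAGE_OFFSET of 0xC0000000 and a made-up segment table. The names `LINEAR_SEGMENTS` and `virt_to_phys` are hypothetical and do not come from gdb, crash, or the kernel core format.

```python
# Assumption: i386-style 3G/1G split, so the linear mapping starts here.
PAGE_OFFSET = 0xC0000000

# Hypothetical stand-in for what the core headers describe:
# (virt_start, phys_start, size) for each linearly mapped segment.
LINEAR_SEGMENTS = [(PAGE_OFFSET, 0x0, 0x20000000)]  # 512 MiB of RAM


def virt_to_phys(vaddr, segments=LINEAR_SEGMENTS):
    """Translate a virtual address using only the linear segments.

    Returns the physical address, or None when the address is not
    covered by any linear segment (for example a module address).
    """
    for virt_start, phys_start, size in segments:
        if virt_start <= vaddr < virt_start + size:
            return phys_start + (vaddr - virt_start)
    return None
```

An address that falls outside every segment comes back as None, which is the case where the debugger needs extra mapping information, such as the module section addresses that Amit's script supplies.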
>
> > kgdb isn't having any problems with kernel threads back traces.
> > The kernel objects are tweaked with dwarf code, but I see no
> > problem with using the same paradigm with crash. Works great.
> >
>
> Can you give some more details on what do you mean by kernel objects
> are tweaked with dwarf code.
Attached is cfi_annotations.patch from kgdb-2.6.16; it is part of
the kgdb patch series. I believe George Anzinger used
a similar dwarf patch in the 2.6 mm series patches that Andrew provided.
I think Tom Rini wrote both of them.
>
> > I'd prefer to have crash and ddd+gdb operate on kernel core files.
> >
>
> You can already do that. Its just a matter of figuring out how to
> get good backtraces both with "crash" as well as "gdb".
I think Tom Rini's cfi_annotations could be a big part of that solution.
>
> > Even better it would be nice to be able to simulate execution on
> > a stack of a core file to be able to re-execute code that caused
> > the crash. I frequently found it convenient after a panic to move
> > the pc to the end of panic, and continue back up the stack to a
> > break point at the system call. Then I'd use the GUI to move the
> > pc to before the execution of the system call and execute it again
> > and watch how the return value was derived that caused the panic.
> >
> > I expect that if you run a kgdb kernel, including the dwarf code,
> > that gdb will have no problem with core dumps. It's convenient to
> > have kgdb configured in the kernel and have the option to continue
> > analysis later with gdb/crash.
> >
>
> Is kgdb mainline? I think some time back Andrew had dropped the patches
> from -mm too.
Yes, I think he had a number of issues with the kgdb patch, but I can't
recall reading exactly what they were. One, I believe, is that the kgdb
patch should be completely non-invasive if not configured in. Currently
some of the patched files don't have #ifdef CONFIG_KGDB in them; I
noticed one last night while checking in some code.
I'd like to put those #ifdefs back in and make kgdb part of the standard
distribution. As I recall, George Anzinger's patch had absolutely no
impact on the kernel if not configured in. That seems very important to me.
> I don't know if distros carry kgdb or not? So not sure
> for how many people will it be helpful to enable kgdb and then take
> core dumps for better back traces.
More for larger servers like a Sun NUMA system. I'd find it convenient
to be able to go back and look at a crash that I had looked at
previously. It might also be good for bug reports to reference the core
files backing up a bug fix.
>
> I don't know much about tweaking objects with dwarf code but got a
> general question. Why can't it be an independent patch in kernel
> independent of kgdb. (If it helps in getting better backtraces.)
Exactly. Locally I just checked in code that I expect will be useful for
kgdb or kdump: stuff like compiling the kernel with -O0 and converting
static inline functions to inline. Code to provide dwarf info and
save registers during a panic seems to qualify as well.
My preference is for kgdb, like kexec, to become part of the
mainstream kernel as a configurable component. Perhaps Andrew
could enumerate his issues. It would make cooperation between
kgdb and crash a bit easier and make kernel debugging a lot
easier for the masses. Recent kgdb patches seem to be getting
much better.
-piet
>
> Thanks
> Vivek
>
--
Piet Delaney
BlueLane Teck
W: (408) 200-5256; [email protected]
H: (408) 243-8872; [email protected]
On Wed, 30 Aug 2006 14:41:32 -0700
Piet Delaney <[email protected]> wrote:
> My preference is for kgdb, like kexec, to become part of the
> mainstream kernel as a configurable component.
Me too. And I expect I could talk Linus into it if a) it works well on a
transport other-than-rs232 and b) the patches are nice and clean.
> Perhaps Andrew
> could enumerate his issues.
a) and b) above. Plus: I'd want to see a maintenance person or team who
respond promptly to email and who remain reasonably engaged with what's
going on in the mainline kernel. Because if problems crop up (and they
will), I don't want to have to be the bunny who has to worry about them...
On Wed, 30 Aug 2006 14:41:32 -0700 Piet Delaney wrote:
> On Wed, 2006-08-30 at 16:40 -0400, Vivek Goyal wrote:
> > On Wed, Aug 30, 2006 at 12:35:21PM -0700, Piet Delaney wrote:
> > > >
> > > > Simple question -- and to be quite honest with you -- I don't
> > > > understand why you wouldn't want to simply use gdb alone
> > > > in this case?
> > >
> > > I don't see any reason for core file not to be read correctly by
> > > gdb. It's convenient to use gdb directly sometimes, for example
> > > while using the ddd GUI.
> > >
> >
> > You can run gdb to open core files as of today but the debugging
> > capability will be limited. For ex. kernel core headers have the info
> > of linearly mapped region only and they don't contain the virt address
> > info of non-linearly mapped regions. So one can not debug the non-linearly
> > mapped regions like modules.
>
> Amit's modified gdb might help for that problem. I haven't used
> it but it allows gdb to load debug information about modules. You
> can also use a script Amit wrote to explicitly load module info
> into stock gdb; that also might work with kernel core files.
>
> >
> > > kgdb isn't having any problems with kernel threads back traces.
> > > The kernel objects are tweaked with dwarf code, but I see no
> > > problem with using the same paradigm with crash. Works great.
> > >
> >
> > Can you give some more details on what do you mean by kernel objects
> > are tweaked with dwarf code.
>
> Attached is the cfi_annotations.patch patch from the kgdb-2.6.16 patch
> which is part of the kgdb patch series. I believe George Anzinger used
> a similar dwarf patch in the 2.6 mm series patches that Andrew provided.
> I think Tom Rini wrote both of them.
ENOPATCH
---
~Randy
On Wed, 2006-08-30 at 14:53 -0700, Randy.Dunlap wrote:
> On Wed, 30 Aug 2006 14:41:32 -0700 Piet Delaney wrote:
>
> > On Wed, 2006-08-30 at 16:40 -0400, Vivek Goyal wrote:
> > > On Wed, Aug 30, 2006 at 12:35:21PM -0700, Piet Delaney wrote:
> > > > >
> > > > > Simple question -- and to be quite honest with you -- I don't
> > > > > understand why you wouldn't want to simply use gdb alone
> > > > > in this case?
> > > >
> > > > I don't see any reason for core file not to be read correctly by
> > > > gdb. It's convenient to use gdb directly sometimes, for example
> > > > while using the ddd GUI.
> > > >
> > >
> > > You can run gdb to open core files as of today but the debugging
> > > capability will be limited. For ex. kernel core headers have the info
> > > of linearly mapped region only and they don't contain the virt address
> > > info of non-linearly mapped regions. So one can not debug the non-linearly
> > > mapped regions like modules.
> >
> > Amit's modified gdb might help for that problem. I haven't used
> > it but it allows gdb to load debug information about modules. You
> > can also use a script Amit wrote to explicitly load module info
> > into stock gdb; that also might work with kernel core files.
> >
> > >
> > > > kgdb isn't having any problems with kernel threads back traces.
> > > > The kernel objects are tweaked with dwarf code, but I see no
> > > > problem with using the same paradigm with crash. Works great.
> > > >
> > >
> > > Can you give some more details on what do you mean by kernel objects
> > > are tweaked with dwarf code.
> >
> > Attached is the cfi_annotations.patch patch from the kgdb-2.6.16 patch
> > which is part of the kgdb patch series. I believe George Anzinger used
> > a similar dwarf patch in the 2.6 mm series patches that Andrew provided.
> > I think Tom Rini wrote both of them.
>
> ENOPATCH
Opps.
-piet
>
> ---
> ~Randy
On Wed, 30 Aug 2006 14:48:22 -0700
Andrew Morton <[email protected]> wrote:
> Plus: I'd want to see a maintenance person or team who
> respond promptly to email and who remain reasonably engaged with what's
> going on in the mainline kernel. Because if problems crop up (and they
> will), I don't want to have to be the bunny who has to worry about them...
umm, clarification needed here.
No criticism of the present maintainers intended! Last time I grabbed the
kgdb patches from sf.net they applied nicely, worked quite reliably (much
better than the old ones I'd been trying to sustain) and had been
tremendously cleaned up.
But if we're to move this work from sf.net to kernel.org, the kgdb
maintainers' workload, email load, turnaround time requirements,
bug-difficulty and everything else will go up quite a lot, at least short-term.
If they don't want to volunteer to take that on (perfectly legit and sane) then
things should stay as they are.
(otoh, a merge would decrease their patch-maintenance load, and would
increase the number of people who fix things for them, and might attract new
maintainers).
It's a big step.
On Wed, 2006-08-30 at 15:57 -0700, Andrew Morton wrote:
> On Wed, 30 Aug 2006 14:48:22 -0700
> Andrew Morton <[email protected]> wrote:
>
> > Plus: I'd want to see a maintenance person or team who
> > respond promptly to email and who remain reasonably engaged with what's
> > going on in the mainline kernel. Because if problems crop up (and they
> > will), I don't want to have to be the bunny who has to worry about them...
>
> umm, clarification needed here.
>
> No criticism of the present maintainers intended! Last time I grabbed the
> kgdb patches from sf.net they applied nicely, worked quite reliably (much
> better than the old ones I'd been trying to sustain) and had been
> tremendously cleaned up.
So why did you stop including them in the mm patch?
I recall your quality issues, and Tom was all in favor
of resolving them. Was it the work of cleaning up the
patches to meet your needs that led to the patch being
dropped from the mm series?
kgdb over ethernet is working great, and it looks like there
is plenty of support on the SF mailing list.
>
> But if we're to move this work from sf.net to kernel.org, the kgdb
> maintainers' workload, email load, turnaround time requirements,
> bug-difficulty and everything else will go up quite a lot, at least short-term.
> If they don't want to volunteer take that on (perfectly legit and sane) then
> things should stay as they are.
On the mailing list I've only seen positive views about resolving
the issues.
>
> (otoh, a merge would decrease their patch-maintenance load, and would
> increase the number of people who fix things for them, and might attract new
> maintainers).
I agree, it would likely attract many more maintainers and be easier
to maintain with git than patches.
>
> It's a big step.
How about a concrete list of patch quality issues that the group
can address, to allow your weekly addition to the mm patch as a
step toward eventual integration?
Wouldn't getting kgdb back into the mm patch series be a reasonable
first step toward eventual maintenance in kernel.org? I hadn't even noticed
that it had been dropped until today's discussion on the crash mailing
list.
-piet
On Wed, 30 Aug 2006 19:42:32 -0700
Piet Delaney <[email protected]> wrote:
> On Wed, 2006-08-30 at 15:57 -0700, Andrew Morton wrote:
> > On Wed, 30 Aug 2006 14:48:22 -0700
> > Andrew Morton <[email protected]> wrote:
> >
> > > Plus: I'd want to see a maintenance person or team who
> > > respond promptly to email and who remain reasonably engaged with what's
> > > going on in the mainline kernel. Because if problems crop up (and they
> > > will), I don't want to have to be the bunny who has to worry about them...
> >
> > umm, clarification needed here.
> >
> > No criticism of the present maintainers intended! Last time I grabbed the
> > kgdb patches from sf.net they applied nicely, worked quite reliably (much
> > better than the old ones I'd been trying to sustain) and had been
> > tremendously cleaned up.
>
> So why did you stop including them in the mm patch?
Some change in 2.6.17-pre caused it to all stop working.
> I recall your quality issue and Tom was all in favor
> of resolving them. Was it too much work cleaning up the
> patches to meet your needs that lead to the patch being
> dropped from the mm series?
It all seems reasonably clean now, but I haven't looked closely (nor have I
had to)
> kgdb over ethernet is working great, and it looks like there
> is plenty of support on the SF mailing list.
good.
> >
> > It's a big step.
>
> How about a concrete list of patch quality issues that the group
> can address to allow your weekly addition to the mm patch as a
> set toward eventually integration.
From whom? me?
> Wouldn't getting kgdb back into the mm patch series be a reasonable
> first step eventual maintenance in kernel.org?
Is on my todo list somewhere.
Piet Delaney <[email protected]> writes:
> >
> > ENOPATCH
>
> Opps.
What an ugly patch!
But it should be totally obsolete with the unwinder work Jan and I have been
doing recently, which does this all properly. .18 isn't quite there
yet in all cases, but .19 hopefully will be.
-Andi
On Thu, Aug 31, 2006 at 04:07:15PM +0200, Andi Kleen wrote:
> Piet Delaney <[email protected]> writes:
> > >
> > > ENOPATCH
> >
> > Opps.
>
> What an ugly patch!
>
> But it should be totally obsolete with the unwinder work Jan and me have been
> doing recently which does this all properly. .18 isn't quite there
> yet in all cases, but .19 will be hopefully.
Indeed. But quite functional. Have you guys been doing i386 as well?
This kind of thing was needed to convince gdb when it really was time to
stop trying to unwind in a few cases, but it looks quite bad on x86_64/i386.
Thankfully getting it to stop on ARM was pretty easy (but it wasn't
full/true annotations).
--
Tom Rini
On Thursday 31 August 2006 16:20, Tom Rini wrote:
> On Thu, Aug 31, 2006 at 04:07:15PM +0200, Andi Kleen wrote:
> > Piet Delaney <[email protected]> writes:
> > > >
> > > > ENOPATCH
> > >
> > > Opps.
> >
> > What an ugly patch!
> >
> > But it should be totally obsolete with the unwinder work Jan and me have been
> > doing recently which does this all properly. .18 isn't quite there
> > yet in all cases, but .19 will be hopefully.
>
> Indeed. But quite functional. Have you guys been doing i386 as well?
Yes.
-Andi
P.S.: Please don't include member-only lists in linux-kernel cc lists. Dropped.
On Thu, 2006-08-31 at 07:20 -0700, Tom Rini wrote:
> On Thu, Aug 31, 2006 at 04:07:15PM +0200, Andi Kleen wrote:
> > Piet Delaney <[email protected]> writes:
> > > >
> > > > ENOPATCH
> > >
> > > Opps.
> >
> > What an ugly patch!
> >
> > But it should be totally obsolete with the unwinder work Jan and me have been
> > doing recently which does this all properly. .18 isn't quite there
> > yet in all cases, but .19 will be hopefully.
>
> Indeed. But quite functional. Have you guys been doing i386 as well?
> This kind of thing was needed to convince gdb when it really was time to
> stop trying unwind in a few cases, but looks quite bad on x86_64/i386.
> Thankfully getting it to stop on ARM was pretty easy (but it wasn't
> full/true annotations).
I wonder if we are killing a fly with a sledgehammer. On SunOS 4.1.4 I
just patched the top of the stack with a NULL pointer. On SPARC the kernel
uses different registers than user space, and I don't recall there being a
problem with a NULL pointer at the top of the kernel stack. Is
there a problem on the i386 architecture with the top of the kernel
stack holding a NULL pointer? My guess is that the value there is needed
to return to the right place in user space.
-piet
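
The NULL-pointer idea above can be pictured as a frame-pointer walk that stops as soon as the saved frame pointer is zero. What follows is only a sketch under a simplified stack model in which each frame stores a (saved frame pointer, return address) pair. It is not gdb's unwinder and not the CFI approach in cfi_annotations.patch; the `backtrace` function and the addresses in `stack` are made up for illustration.

```python
def backtrace(memory, fp, max_frames=64):
    """Collect return addresses by following saved frame pointers.

    memory maps a frame pointer to (saved_fp, return_addr).
    The walk ends at a NULL (0) frame pointer, or after max_frames
    frames as a guard against a corrupted or looping chain.
    """
    frames = []
    while fp != 0 and len(frames) < max_frames:
        saved_fp, return_addr = memory[fp]
        frames.append(return_addr)
        fp = saved_fp
    return frames


# Hypothetical three-frame kernel stack; the outermost frame's saved
# frame pointer has been patched to NULL so the walk terminates cleanly.
stack = {
    0x1000: (0x1040, 0xC0101234),
    0x1040: (0x1080, 0xC0105678),
    0x1080: (0x0, 0xC0109ABC),
}
```

Calling `backtrace(stack, 0x1000)` yields the three return addresses from innermost to outermost and then stops, without trying to unwind past the top of the kernel stack.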
>
On Wed, 2006-08-30 at 20:00 -0700, Andrew Morton wrote:
> On Wed, 30 Aug 2006 19:42:32 -0700
> Piet Delaney <[email protected]> wrote:
>
> > On Wed, 2006-08-30 at 15:57 -0700, Andrew Morton wrote:
> > > On Wed, 30 Aug 2006 14:48:22 -0700
> > > Andrew Morton <[email protected]> wrote:
> > >
> > > > Plus: I'd want to see a maintenance person or team who
> > > > respond promptly to email and who remain reasonably engaged with what's
> > > > going on in the mainline kernel. Because if problems crop up (and they
> > > > will), I don't want to have to be the bunny who has to worry about them...
> > >
> > > umm, clarification needed here.
> > >
> > > No criticism of the present maintainers intended! Last time I grabbed the
> > > kgdb patches from sf.net they applied nicely, worked quite reliably (much
> > > better than the old ones I'd been trying to sustain) and had been
> > > tremendously cleaned up.
> >
> > So why did you stop including them in the mm patch?
>
> Some change in 2.6.17-pre caused it to all stop working.
>
> > I recall your quality issue and Tom was all in favor
> > of resolving them. Was it too much work cleaning up the
> > patches to meet your needs that lead to the patch being
> > dropped from the mm series?
>
> It all seems reasonably clean now, but I haven't looked closely (nor have I
> had to)
Any suggestions on how to progress?
>
> > kgdb over ethernet is working great, and it looks like there
> > is plenty of support on the SF mailing list.
>
> good.
>
> > >
> > > It's a big step.
> >
> > How about a concrete list of patch quality issues that the group
> > can address to allow your weekly addition to the mm patch as a
> > set toward eventually integration.
>
> From whom? me?
>
> > Wouldn't getting kgdb back into the mm patch series be a reasonable
> > first step eventual maintenance in kernel.org?
>
> Is on my todo list somewhere.
--
Piet Delaney Phone: (408) 200-5256
Blue Lane Technologies Fax: (408) 200-5299
10450 Bubb Rd.
Cupertino, Ca. 95014 Email: [email protected]