Hi!
Since you seem to care about kgdb, I figured you might want to fix this
before I mark it broken on x86 (we've been considering doing that for a
while).
AFAICT the whole debugreg usage of kgdb-x86_64 is completely hosed; it
doesn't respsect the normal exclusion zones as per arch_build_bp_info().
That is, breakpoints must never be in:
- in the cpu_entry_area
- in .entry.text
- in .noinstr.text
- in anything else marked NOKPROBE
by not respecting these constraints it is trivial to completely and
utterly hose the machine. The entry rework that is current underway will
explicitly not deal with #DB triggering in any of those places.
On Mon, May 25, 2020 at 10:36:05AM +0200, Peter Zijlstra wrote:
> Hi!
>
> Since you seem to care about kgdb, I figured you might want to fix this
> before I mark it broken on x86 (we've been considering doing that for a
> while).
>
> AFAICT the whole debugreg usage of kgdb-x86_64 is completely hosed; it
> doesn't respsect the normal exclusion zones as per arch_build_bp_info().
>
> That is, breakpoints must never be in:
>
> - in the cpu_entry_area
> - in .entry.text
> - in .noinstr.text
> - in anything else marked NOKPROBE
>
> by not respecting these constraints it is trivial to completely and
> utterly hose the machine. The entry rework that is current underway will
> explicitly not deal with #DB triggering in any of those places.
This also very much includes single stepping those bits. Which KGDB
obviously also does not respects.
On Mon, May 25, 2020 at 11:18:32AM +0200, Peter Zijlstra wrote:
> On Mon, May 25, 2020 at 10:36:05AM +0200, Peter Zijlstra wrote:
> > Hi!
> >
> > Since you seem to care about kgdb, I figured you might want to fix this
> > before I mark it broken on x86 (we've been considering doing that for a
> > while).
> >
> > AFAICT the whole debugreg usage of kgdb-x86_64 is completely hosed; it
> > doesn't respsect the normal exclusion zones as per arch_build_bp_info().
> >
> > That is, breakpoints must never be in:
> >
> > - in the cpu_entry_area
> > - in .entry.text
> > - in .noinstr.text
> > - in anything else marked NOKPROBE
> >
> > by not respecting these constraints it is trivial to completely and
> > utterly hose the machine. The entry rework that is current underway will
> > explicitly not deal with #DB triggering in any of those places.
>
> This also very much includes single stepping those bits. Which KGDB
> obviously also does not respects.
For breakpoints there's already a pre-poke validation hook that
architectures can override if they want to. I can modify the default
implementation to include checking the nokprobe list.
Stepping is a bit more complex. There are hooks for some of the
underlying work but not pre-step validation hook. I'll see if we can add
one.
Daniel.
On Tue, May 26, 2020 at 05:16:21PM +0100, Daniel Thompson wrote:
> On Mon, May 25, 2020 at 11:18:32AM +0200, Peter Zijlstra wrote:
> > On Mon, May 25, 2020 at 10:36:05AM +0200, Peter Zijlstra wrote:
> > > Hi!
> > >
> > > Since you seem to care about kgdb, I figured you might want to fix this
> > > before I mark it broken on x86 (we've been considering doing that for a
> > > while).
> > >
> > > AFAICT the whole debugreg usage of kgdb-x86_64 is completely hosed; it
> > > doesn't respsect the normal exclusion zones as per arch_build_bp_info().
> > >
> > > That is, breakpoints must never be in:
> > >
> > > - in the cpu_entry_area
> > > - in .entry.text
> > > - in .noinstr.text
> > > - in anything else marked NOKPROBE
> > >
> > > by not respecting these constraints it is trivial to completely and
> > > utterly hose the machine. The entry rework that is current underway will
> > > explicitly not deal with #DB triggering in any of those places.
> >
> > This also very much includes single stepping those bits. Which KGDB
> > obviously also does not respects.
>
> For breakpoints there's already a pre-poke validation hook that
> architectures can override if they want to. I can modify the default
> implementation to include checking the nokprobe list.
Excellent, and I suppose the arch callback should be changed to share
code with arch_build_bp_info(), which Lai was extending here:
https://lkml.kernel.org/r/[email protected]
> Stepping is a bit more complex. There are hooks for some of the
> underlying work but not pre-step validation hook. I'll see if we can add
> one.
That'd be great; because where we're going getting this wrong is
insta-fail.
Another point to look at is the whole dbg_is_early; I suspect that's
similarly wrecked, the entry code isn't more robust early on.