Since 4.19.5 I have not been able to boot kernels on my Xen-hosted VM on
a system with an Intel Xeon L5520 processor (microcode 0x1d).

4.19.4 worked fine; I've tried kernels 4.19.5, 4.19.6, 4.19.7, 4.19.9, 4.19.10,
and 4.20-rc7, and they all throw:
BUG: unable to handle kernel paging request at ffff88903fffc000
PGD 1c0c067 P4D 1c0c067 PUD 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
... [register dump and call trace omitted, since I believe I located the
offending source code]
Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009
Kernel Offset: disabled
early in the system boot (fine details might differ; I did not carefully
record the result of each test).

I finally had time to do some bisection, and the problem traces back to commit
d52888aa2753e3063a9d3a0c9f72f94aa9809c15
in tree fbbb33771ac6c392caeb283163a594f1a7e6d04d
(x86/mm: Move LDT remap out of KASLR region on 5-level paging).

After backing out this one commit, I have successfully built and booted
4.19.5, 4.19.10 and 4.20-rc7 (I haven't bothered trying other versions).
(The VM is currently running my patched 4.19.10 without any apparent problem.)

Notes: all 4.19.* builds are of Gentoo's "gentoo-sources" tree (which includes
standard Gentoo patches), plus wireguard; the 4.20-rc7 build was unpatched.
I have been successfully running a different build of each 4.19.* version
on my laptop (with many more device drivers enabled), so the problem
is unlikely to be due to the build toolchain (which is common to both
of my build trees).

Let me know if there is more information that you would find helpful,
or if you have patches that you'd like me to test.

--Ken Pizzini

On 12/19/18 4:25 PM, Ken Pizzini wrote:
> [...]
This is addressed by https://lkml.org/lkml/2018/12/11/266 but has not
been merged yet.

(Next time for Xen issues please copy [email protected])

-boris

On Thu, Dec 20, 2018 at 11:12:40AM -0500, Boris Ostrovsky wrote:
> This is addressed by https://lkml.org/lkml/2018/12/11/266 but has not
> been merged yet.

Confirmed: applying the patch in that posting to my Gentoo
sys-kernel/gentoo-sources-4.19.10 tree results in a working
system for me.

Thanks,
--Ken Pizzini