Hello,
4.4.147 is the last kernel of the 4.4.x series that boots for me. All
subsequent versions panic on boot. How do I report this bug?
If I'm supposed to use https://bugzilla.kernel.org/ I don't know what
to fill into the fields. I don't even know if the longterm kernel falls
under "Mainline" or some other tree.
MSB
--
Fun chemistry experiments:
Sodium chloride doubles its volume
when combined with an equal amount of table salt.
On Fri, Aug 24, 2018 at 02:24:08PM +0200, Matthias B. wrote:
> Hello,
>
> 4.4.147 is the last kernel of the 4.4.x series that boots for me. All
> subsequent versions panic on boot. How do I report this bug?
> If I'm supposed to use https://bugzilla.kernel.org/ I don't know what
> to fill into the fields. I don't even know if the longterm kernel falls
> under "Mainline" or some other tree.
It depends on what the panic looks like :)
Any hints? You can post it here if you want.
Also, if you can run 'git bisect' to track it down to the commit that
causes the problem, that is even better, as we can cc: all of the people
on that patch to get help from.
thanks,
greg k-h
On Fri, 24 Aug 2018 14:59:50 +0200
Greg KH <[email protected]> wrote:
> It depends on what the panic looks like :)
>
> Any hints? You can post it here if you want.
The attached image is everything I see on screen. Scrolling up does not
work.
>
> Also, if you can run 'git bisect' to track it down to the commit that
> causes the problem, that is even better, as we can cc: all of the
> people on that patch to get help from.
I can try.
Which git repository and branch stores the 4.4.x kernel and which
changeset/tag is the 4.4.147 kernel (which is the last one that works
for me)?
MSB
--
A man without light need not fear darkness.
The following is the dmesg output from my working 4.4.147 around the
time when the newer kernel panics.
[ 0.425380] Bluetooth: HIDP (Human Interface Emulation) ver 1.2
[ 0.425932] Bluetooth: HIDP socket layer initialized
[ 0.426617] microcode: CPU0 sig=0x306c3, pf=0x2, revision=0x19
[ 0.427203] microcode: CPU1 sig=0x306c3, pf=0x2, revision=0x19
[ 0.427772] microcode: CPU2 sig=0x306c3, pf=0x2, revision=0x19
[ 0.428688] microcode: CPU3 sig=0x306c3, pf=0x2, revision=0x19
[ 0.429221] microcode: CPU4 sig=0x306c3, pf=0x2, revision=0x19
[ 0.429744] microcode: CPU5 sig=0x306c3, pf=0x2, revision=0x19
[ 0.430255] microcode: CPU6 sig=0x306c3, pf=0x2, revision=0x19
[ 0.430761] microcode: CPU7 sig=0x306c3, pf=0x2, revision=0x19
[ 0.431271] microcode: Microcode Update Driver: v2.01 <[email protected]>, Peter Oruba
[ 0.431815] AVX2 version of gcm_enc/dec engaged.
[ 0.432347] AES CTR mode by8 optimization enabled
[ 0.433358] registered taskstats version 1
[ 0.434086] Btrfs loaded
[ 0.434991] rtc_cmos 00:02: setting system clock to 2018-08-24 13:35:45 UTC (1535117745)
[ 0.435586] ALSA device list:
[ 0.436074] No soundcards found.
All of the timestamps visible on screen for the panicking kernel fall
within this gap.
[ 0.595677] snd_hda_intel 0000:01:00.1: Too many HDMI devices
[ 0.596251] snd_hda_intel 0000:01:00.1: Consider building the kernel with CONFIG_SND_DYNAMIC_MINORS=y
[ 0.596839] snd_hda_intel 0000:01:00.1: Too many HDMI devices
[ 0.597412] snd_hda_intel 0000:01:00.1: Consider building the kernel with CONFIG_SND_DYNAMIC_MINORS=y
[ 0.612466] snd_hda_intel 0000:01:00.1: control 3:0:0:ELD:0 is already present
[ 0.613100] snd_hda_codec_hdmi: probe of hdaudioC0D0 failed with error -16
--
If God hadn't wanted me to be paranoid,
He wouldn't have given me such a vivid imagination.
On Fri, Aug 24, 2018 at 04:09:47PM +0200, Matthias B. wrote:
> The following is the dmesg output from my working 4.4.147 around the
> time when the newer kernel panics.
>
>
> [ 0.425380] Bluetooth: HIDP (Human Interface Emulation) ver 1.2
> [ 0.425932] Bluetooth: HIDP socket layer initialized
> [ 0.426617] microcode: CPU0 sig=0x306c3, pf=0x2, revision=0x19
> [ 0.427203] microcode: CPU1 sig=0x306c3, pf=0x2, revision=0x19
> [ 0.427772] microcode: CPU2 sig=0x306c3, pf=0x2, revision=0x19
> [ 0.428688] microcode: CPU3 sig=0x306c3, pf=0x2, revision=0x19
> [ 0.429221] microcode: CPU4 sig=0x306c3, pf=0x2, revision=0x19
> [ 0.429744] microcode: CPU5 sig=0x306c3, pf=0x2, revision=0x19
> [ 0.430255] microcode: CPU6 sig=0x306c3, pf=0x2, revision=0x19
> [ 0.430761] microcode: CPU7 sig=0x306c3, pf=0x2, revision=0x19
> [ 0.431271] microcode: Microcode Update Driver: v2.01 <[email protected]>, Peter Oruba
> [ 0.431815] AVX2 version of gcm_enc/dec engaged.
> [ 0.432347] AES CTR mode by8 optimization enabled
> [ 0.433358] registered taskstats version 1
> [ 0.434086] Btrfs loaded
> [ 0.434991] rtc_cmos 00:02: setting system clock to 2018-08-24 13:35:45 UTC (1535117745)
> [ 0.435586] ALSA device list:
> [ 0.436074] No soundcards found.
>
>
>
> All of the timestamps visible on screen for the panicking kernel fall
> within this gap.
>
>
>
> [ 0.595677] snd_hda_intel 0000:01:00.1: Too many HDMI devices
> [ 0.596251] snd_hda_intel 0000:01:00.1: Consider building the kernel with CONFIG_SND_DYNAMIC_MINORS=y
> [ 0.596839] snd_hda_intel 0000:01:00.1: Too many HDMI devices
> [ 0.597412] snd_hda_intel 0000:01:00.1: Consider building the kernel with CONFIG_SND_DYNAMIC_MINORS=y
> [ 0.612466] snd_hda_intel 0000:01:00.1: control 3:0:0:ELD:0 is already present
> [ 0.613100] snd_hda_codec_hdmi: probe of hdaudioC0D0 failed with error -16
Have you tried enabling CONFIG_SND_DYNAMIC_MINORS like the kernel is
asking you to here? Does that help?
thanks,
greg k-h
On Fri, Aug 24, 2018 at 03:43:44PM +0200, Matthias B. wrote:
>
> On Fri, 24 Aug 2018 14:59:50 +0200
> Greg KH <[email protected]> wrote:
>
> > It depends on what the panic looks like :)
> >
> > Any hints? You can post it here if you want.
>
> The attached image is everything I see on screen. Scrolling up does not
> work.
>
> >
> > Also, if you can run 'git bisect' to track it down to the commit that
> > causes the problem, that is even better, as we can cc: all of the
> > people on that patch to get help from.
>
> I can try.
> Which git repository and branch stores the 4.4.x kernel and which
> changeset/tag is the 4.4.147 kernel (which is the last one that works
> for me)?
All of the stable trees are here:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/
you want the linux-4.4.y branch to work off of.
All of the kernel releases are tagged, so you can start with v4.4.147
as the good entry for 'git bisect'.
thanks,
greg k-h
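
For reference, the bisect setup described above could be sketched roughly
as follows. This is a sketch under stated assumptions, not a tested
recipe: the repository URL, the linux-4.4.y branch and the v4.4.147 tag
come from this thread, while the directory name linux-stable, the helper
name start_bisect and the choice of v4.4.148 as the first bad tag (based
on the report that every release after 4.4.147 panics) are illustrative.

```shell
#!/bin/sh
# Values taken from the thread.
STABLE_URL="https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/"
BRANCH="linux-4.4.y"
GOOD_TAG="v4.4.147"
# Assumption: the release right after the last good one already panics.
BAD_TAG="v4.4.148"

# Hypothetical helper: clone the stable tree and mark the bisect endpoints.
# Note that it leaves the calling shell inside the new linux-stable directory.
start_bisect() {
    git clone --branch "$BRANCH" "$STABLE_URL" linux-stable &&
    cd linux-stable &&
    git bisect start &&
    git bisect bad "$BAD_TAG" &&
    git bisect good "$GOOD_TAG"
}
```

After each build and boot test, 'git bisect good' or 'git bisect bad'
records the result and checks out the next candidate; 'git bisect reset'
returns to the original branch once the first bad commit has been
reported.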
On Fri, 24 Aug 2018 16:11:20 +0200
Greg KH <[email protected]> wrote:
> > [ 0.595677] snd_hda_intel 0000:01:00.1: Too many HDMI devices
> > [ 0.596251] snd_hda_intel 0000:01:00.1: Consider building the kernel with CONFIG_SND_DYNAMIC_MINORS=y
> > [ 0.596839] snd_hda_intel 0000:01:00.1: Too many HDMI devices
> > [ 0.597412] snd_hda_intel 0000:01:00.1: Consider building the kernel with CONFIG_SND_DYNAMIC_MINORS=y
> > [ 0.612466] snd_hda_intel 0000:01:00.1: control 3:0:0:ELD:0 is already present
> > [ 0.613100] snd_hda_codec_hdmi: probe of hdaudioC0D0 failed with error -16
>
> Have you tried enabling CONFIG_SND_DYNAMIC_MINORS like the kernel is
> asking you to here? Does that help?
I've just tried it and it doesn't help. Note that the message above is
from my working kernel and has a timestamp later than the panic, so if
the timing of the 4.4.152 kernel is roughly similar to that of 4.4.147,
the panic occurs before that snd_hda_intel code is even reached.
MSB
--
Bad comments reveal the bad programmer.
On Fri, 24 Aug 2018 16:12:54 +0200
Greg KH <[email protected]> wrote:
>
> All of the stable trees are here:
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/
> you want the linux-4.4.y branch to work off of.
>
> All of the kernel releases are tagged, so you can start with v4.4.147
> as the good entry for 'git bisect'.
I've never used git bisect on the kernel. Can I simply do "make" for
each step or do I need to "make mrproper" every time?
MSB
On Fri, Aug 24, 2018 at 04:33:38PM +0200, Matthias B. wrote:
> On Fri, 24 Aug 2018 16:12:54 +0200
> Greg KH <[email protected]> wrote:
>
> >
> > All of the stable trees are here:
> > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/
> > you want the linux-4.4.y branch to work off of.
> >
> > All of the kernel releases are tagged, so you can start with v4.4.147
> > as the good entry for 'git bisect'.
>
> I've never used git bisect on the kernel. Can I simply do "make" for
> each step or do I need to "make mrproper" every time?
'make oldconfig' and then 'make' should be just fine. But add '-j10' or
so (the number of your cpus*2), so the build goes faster.
thanks,
greg k-h
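
The per-step build suggested above might look like the sketch below. It
assumes a Linux host where nproc (GNU coreutils) is available; the
variable JOBS and the helper name build_step are illustrative and not
part of the kernel build system.

```shell
#!/bin/sh
# Twice the number of available CPUs, as suggested above.
# Assumption: nproc from GNU coreutils is installed.
JOBS=$(( $(nproc) * 2 ))

# Hypothetical helper: refresh the existing .config for the currently
# checked-out revision, then do an incremental parallel build.
# 'make oldconfig' may prompt if the revision introduces new options.
build_step() {
    make oldconfig &&
    make -j"$JOBS"
}
```

Run build_step from the top of the kernel tree at each bisect step, then
install and boot the result before marking the revision good or bad.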
Bisect identified the problem. It's the attached patch. I applied it to
4.4.152 with patch -Rp1 and I'm running the resulting kernel now.
MSB
--
For every idiot-proof system there exists at least one system-proof
idiot.
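
As a rough illustration of the revert step: -R applies a patch in
reverse, and -p1 strips the leading a/ or b/ path component used by git
patches. The helper name revert_patch and the patch file name in the
usage line are hypothetical, and --dry-run is assumed to be available
(it is a GNU patch option).

```shell
#!/bin/sh
# Hypothetical helper: reverse-apply a patch from the top of a source tree.
# The dry run checks that the patch reverts cleanly before any file is
# modified.
revert_patch() {
    patch_file=$1
    patch -Rp1 --dry-run < "$patch_file" &&
    patch -Rp1 < "$patch_file"
}

# Usage (file name is illustrative only):
#   cd linux-4.4.152 && revert_patch ../set_memory_np-l1tf.patch
```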
On Fri, Aug 24, 2018 at 06:19:19PM +0200, Matthias B. wrote:
> Bisect identified the problem. It's the attached patch. I applied it to
> 4.4.152 with patch -Rp1 and I'm running the resulting kernel now.
>
> From 02ff2769edbce2261e981effbc3c4b98fae4faf0 Mon Sep 17 00:00:00 2001
> From: Andi Kleen <[email protected]>
> Date: Tue, 7 Aug 2018 15:09:39 -0700
> Subject: [PATCH] x86/mm/pat: Make set_memory_np() L1TF safe
>
> commit 958f79b9ee55dfaf00c8106ed1c22a2919e0028b upstream
>
> set_memory_np() is used to mark kernel mappings not present, but it has
> its own open coded mechanism which does not have the L1TF protection of
> inverting the address bits.
>
> Replace the open coded PTE manipulation with the L1TF protecting low level
> PTE routines.
>
> Passes the CPA self test.
>
> Signed-off-by: Andi Kleen <[email protected]>
> Signed-off-by: Thomas Gleixner <[email protected]>
> [ dwmw2: Pull in pud_mkhuge() from commit a00cc7d9dd, and pfn_pud() ]
> Signed-off-by: David Woodhouse <[email protected]>
> [groeck: port to 4.4]
> Signed-off-by: Guenter Roeck <[email protected]>
> Signed-off-by: Greg Kroah-Hartman <[email protected]>
> ---
> arch/x86/include/asm/pgtable.h | 27 +++++++++++++++++++++++++++
> arch/x86/mm/pageattr.c | 8 ++++----
> 2 files changed, 31 insertions(+), 4 deletions(-)
<snip>
Guenter, another report of this patch causing an issue. Any ideas? I
am away from test systems this weekend, but can push out patches if
needed.
thanks,
greg k-h