2022-03-30 14:39:14

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: mainline: qemu_arm64 boot failed - Failed to start Hostname Service - Stopped Serial Getty on ttyAMA0

On Wed, Mar 30, 2022 at 01:45:14PM +0530, Naresh Kamboju wrote:
> On Wed, 30 Mar 2022 at 13:32, Greg Kroah-Hartman
> <[email protected]> wrote:
> >
> > On Wed, Mar 30, 2022 at 12:46:40PM +0530, Naresh Kamboju wrote:
> > > Linux mainline kernel booting qemu_arm64 and Raspberry Pi 4 boot
> > > failed [1] and [2].
> > >
> > > Boot log:
> > > [ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x413fd0c1]
> > > [ 0.000000] Linux version 5.17.0 (tuxmake@tuxmake)
> > > (aarch64-linux-gnu-gcc (Debian 11.2.0-18) 11.2.0, GNU ld (GNU Binutils
> > > for Debian) 2.38) #1 SMP PREEMPT @1648607824
> > > [ 0.000000] Machine model: linux,dummy-virt
> > >
> > > <trim>
> > >
> > > Starting Hostname Service...
> > > [FAILED] Failed to start Hostname Service.
> > > See 'systemctl status systemd-hostnamed.service' for details.
> > > [ 2.453172] random: crng init done
> > > [ 2.453522] random: 7 urandom warning(s) missed due to ratelimiting
> > > [ 2.483013] audit: type=1334 audit(1648621045.348:8): prog-id=0 op=UNLOAD
> > > [ 2.483953] audit: type=1334 audit(1648621045.348:9): prog-id=0 op=UNLOAD
> > > [ 2.609230] audit: type=1334 audit(1648621045.476:10): prog-id=10 op=LOAD
> > > [ OK ] Started Resize root filesystem to fit available disk space.
> > > [ ] A start job is running for Network Manager (6s / 1min 30s)
> > > ] A start job is running for Network Manager (7s / 1min 30s)[ ] A
> > > start job is running for Network Manager (7s / 1min 30s)[ ] A start
> > > job is running for Network Manager (8s / 1min 30s)[ ] A start job is
> > > running for Network Manager (9s / 1min 30s)[ ] A start job is
> > > running for Network Manager (9s / 1min 30s)[ [ A start job is
> > > running for Network Manager (10s / 1min 30s)[ ] A start job is
> > > running for Network Manager (10s / 1min 30s)[ [ A start job is
> > > running for Network Manager (11s / 1min 30s)[ ] A start job is
> > > running for Network Manager (12s / 1min 30s)[ ] A start job is
> > > running for Network Manager (12s / 1min 30s)[ ] A start job is
> > > running for Network Manager (13s / 1min 30s)[ ] A start job is
> > > running for Network Manager (13s / 1min 30s) ] A start job is
> > > running for Network Manager (14s / 1min 30s)[ ] A start job is
> > > running for Network Manager (14s / 1min 30s) ] A start job is
> > > running for Network Manager (15s / 1min 30s)[ ] A start job is
> > > running for Network Manager (16s / 1min 30s)[ ] A start job is
> > > running for Network Manager (16s / 1min 30s)[ ] A start job is
> > > running for Network Manager (17s / 1min 30s)[ ] A start job is
> > > running for Network Manager (17s / 1min 30s)[ [ A start job is
> > > running for Network Manager (18s / 1min 30s)[ ] A start job is
> > > running for Network Manager (19s / 1min 30s)[ [ A start job is
> > > running for Network Manager (19s / 1min 30s)[ ] A start job is
> > > running for Network Manager (19s / 1min 30s)[ ] A start job is
> > > running for Network Manager (20s / 1min 30s)[ ] A start job is
> > > running for Network Manager (20s / 1min 30s)[ ] A start job is
> > > running for Network Manager (21s / 1min 30s) ] A start job is
> > > running for Network Manager (21s / 1min 30s)[ ] A start job is
> > > running for Network Manager (22s / 1min 30s) ] A start job is
> > > running for Network Manager (22s / 1min 30s)[ ] A start job is
> > > running for Network Manager (23s / 1min 30s)[ ] A start job is
> > > running for Network Manager (23s / 1min 30s)[ ] A start job is
> > > running for Network Manager (24s / 1min 30s)[ ] A start job is
> > > running for Network Manager (24s / 1min 30s)[ [ A start job is
> > > running for Network Manager (25s / 1min 30s)[ ] A start job is
> > > running for Network Manager (25s / 1min 30s)[ OK ] Started Network
> > > Manager.
> > > [ OK ] Listening on Load/Save RF Kill Switch Status /dev/rfkill Watch.
> > > Starting Network Manager Script Dispatcher Service...
> > > [ 27.359980] kauditd_printk_skb: 11 callbacks suppressed
> > > [ 27.359988] audit: type=1701 audit(1648621070.228:22):
> > > auid=4294967295 uid=0 gid=0 ses=4294967295 pid=313 comm=\"(spatcher)\"
> > > exe=\"/lib/systemd/systemd\" sig=6 res=1
> > > [ OK ] Reached target Network.
> > > Starting DNS forwarder and DHCP server...
> > > [ 27.367248] audit: type=1701 audit(1648621070.232:23):
> > > auid=4294967295 uid=0 gid=0 ses=4294967295 pid=315 comm=\"(dnsmasq)\"
> > > exe=\"/lib/systemd/systemd\" sig=6 res=1
> > > Starting Permit User Sessions...
> > > [ 27.371110] audit: type=1701 audit(1648621070.240:24):
> > > auid=4294967295 uid=0 gid=0 ses=4294967295 pid=317 comm=\"(sessions)\"
> > > exe=\"/lib/systemd/systemd\" sig=6 res=1
> > > [FAILED] Failed to start Network Manager Script Dispatcher Service.
> > > See 'systemctl status NetworkManager-dispatcher.service' for details.
> > > [FAILED] Failed to start DNS forwarder and DHCP server.
> > > See 'systemctl status dnsmasq.service' for details.
> > > [FAILED] Failed to start Permit User Sessions.
> > > See 'systemctl status systemd-user-sessions.service' for details.
> > > [ OK ] Started Getty on tty1.
> > > [ OK ] Started Serial Getty on ttyAMA0.
> > > [ OK ] Reached target Login Prompts.
> > > [ OK ] Reached target Multi-User System.
> > > Starting Update UTMP about System Runlevel Changes...
> > > [ 27.402975] audit: type=1701 audit(1648621070.268:25):
> > > auid=4294967295 uid=0 gid=0 ses=4294967295 pid=322 comm=\"(ate-utmp)\"
> > > exe=\"/lib/systemd/systemd\" sig=6 res=1
> > > [FAILED] Failed to start Update UTMP about System Runlevel Changes.
> > > See 'systemctl status systemd-update-utmp-runlevel.service' for details.
> > > [ 32.398140] audit: type=1701 audit(1648621075.264:26):
> > > auid=4294967295 uid=0 gid=0 ses=4294967295 pid=320 comm=\"(agetty)\"
> > > exe=\"/lib/systemd/systemd\" sig=6 res=1
> > > [ 32.399187] audit: type=1701 audit(1648621075.264:27):
> > > auid=4294967295 uid=0 gid=0 ses=4294967295 pid=321 comm=\"(agetty)\"
> > > exe=\"/lib/systemd/systemd\" sig=6 res=1
> > > [ OK ] Stopped Serial Getty on ttyAMA0.
> > > [ OK ] Started Serial Getty on ttyAMA0.
> > > [ OK ] Stopped Getty on tty1.
> > > [ OK ] Started Getty on tty1.
> > >
> > > Reported-by: Linux Kernel Functional Testing <[email protected]>
> > >
> > > --
> > > Linaro LKFT
> > > https://lkft.linaro.org
> > >
> > > [1] https://lkft.validation.linaro.org/scheduler/job/4809453#L252
> > > [2] https://lkft.validation.linaro.org/scheduler/job/4809125#L386
> > >
> >
> > That's a lot of userspace logs, any kernel logs showing that anything
> > failed?
>
> I do not see kernel logs failures here.

Great, then the kernel is working just fine! :)

Seriously, without some sort of hint, it's going to be impossible for us
to know what to do here...


2022-03-31 03:55:40

by Naresh Kamboju

[permalink] [raw]
Subject: Re: mainline: qemu_arm64 boot failed - Failed to start Hostname Service - Stopped Serial Getty on ttyAMA0

On Wed, 30 Mar 2022 at 13:54, Greg Kroah-Hartman
<[email protected]> wrote:

> > > That's a lot of userspace logs, any kernel logs showing that anything
> > > failed?
> >
> > I do not see kernel logs failures here.
>
> Great, then the kernel is working just fine! :)
>
> Seriously, without some sort of hint, it's going to be impossible for us
> to know what to do here...

Ander is bisecting this problem.

OTOH, I am looking into test history and found the head commit is booting pass.
The problem report I have sent is a head-1 test report.

This means that, current Linus master boot pass.

Do you see any relation of top fix commit vs the report I sent.

BAD: 965181d7ef7e (head -1 )
GOOD: d888c83fcec75194a8 ( head)

git log 965181d7ef7e..d888c83fcec75194a8

commit d888c83fcec75194a8a48ccd283953bdba7b2550 (HEAD -> linux-master,
linux/master)
Author: Linus Torvalds <[email protected]>
Date: Tue Mar 29 23:29:18 2022 -0700

fs: fix fd table size alignment properly

Jason Donenfeld reports that my commit 1c24a186398f ("fs: fd tables have
to be multiples of BITS_PER_LONG") doesn't work, and the reason is an
embarrassing brown-paper-bag bug.

Yes, we want to align the number of fds to BITS_PER_LONG, and yes, the
reason they might not be aligned is because the incoming 'max_fd'
argument might not be aligned.

But aligining the argument - while simple - will cause a "infinitely
big" maxfd (eg NR_OPEN_MAX) to just overflow to zero. Which most
definitely isn't what we want either.

The obvious fix was always just to do the alignment last, but I had
moved it earlier just to make the patch smaller and the code look
simpler. Duh. It certainly made _me_ look simple.

Fixes: 1c24a186398f ("fs: fd tables have to be multiples of BITS_PER_LONG")
Reported-and-tested-by: Jason A. Donenfeld <[email protected]>
Cc: Fedor Pchelkin <[email protected]>
Cc: Alexey Khoroshilov <[email protected]>
Cc: Christian Brauner <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

- Naresh