2021-08-07 14:29:59

by Mikael Pettersson

Subject: [BISECTED][REGRESSION] 5.10.56 longterm kernel breakage on m68k/aranym

I updated the 5.10 longterm kernel on one of my m68k/aranym VMs from
5.10.47 to 5.10.56, and the new kernel failed to boot:

ARAnyM 1.1.0
Using config file: 'aranym1.headless.config'
Could not open joystick 0
ARAnyM RTC Timer: /dev/rtc: Permission denied
ARAnyM LILO: Error loading ramdisk 'root.bin'
Blitter tried to read byte from register ff8a00 at 0077ee

At this point it kept running, but produced no output to the console,
and would never get to the point of starting user-space. Attaching gdb
to aranym showed nothing interesting, i.e. it seemed to be executing
normally.

A git bisect identified the following commit between 5.10.52 and
5.10.53 as the culprit:
# first bad commit: [9e1cf2d1ed37c934c9935f2c0b2f8b15d9355654]
mm/userfaultfd: fix uffd-wp special cases for fork()

5.10.52, 5.11.22, 5.12.19, and 5.13.8 all boot fine. 5.10.53 to
5.10.56 all fail as described above.

The output of "grep ^CONFIG .config" follows. Everything omitted is
disabled, including, I might add, CONFIG_USERFAULTFD.

/Mikael

CONFIG_CC_VERSION_TEXT="m68k-unknown-linux-gcc (GCC) 10.3.1 20210424"
CONFIG_CC_IS_GCC=y
CONFIG_GCC_VERSION=100301
CONFIG_LD_VERSION=231010000
CONFIG_CLANG_VERSION=0
CONFIG_LLD_VERSION=0
CONFIG_CC_CAN_LINK=y
CONFIG_CC_CAN_LINK_STATIC=y
CONFIG_CC_HAS_ASM_GOTO=y
CONFIG_CC_HAS_ASM_INLINE=y
CONFIG_IRQ_WORK=y
CONFIG_BROKEN_ON_SMP=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_LOCALVERSION=""
CONFIG_BUILD_SALT=""
CONFIG_DEFAULT_INIT=""
CONFIG_DEFAULT_HOSTNAME="(none)"
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_GENERIC_IRQ_SHOW=y
CONFIG_PREEMPT_NONE=y
CONFIG_TICK_CPU_ACCOUNTING=y
CONFIG_TINY_RCU=y
CONFIG_SRCU=y
CONFIG_TINY_SRCU=y
CONFIG_LOG_BUF_SHIFT=15
CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT=13
CONFIG_CGROUPS=y
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
CONFIG_RD_GZIP=y
CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE=y
CONFIG_SYSCTL=y
CONFIG_HAVE_UID16=y
CONFIG_BPF=y
CONFIG_EXPERT=y
CONFIG_UID16=y
CONFIG_MULTIUSER=y
CONFIG_FHANDLE=y
CONFIG_POSIX_TIMERS=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_FUTEX_PI=y
CONFIG_HAVE_FUTEX_CMPXCHG=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_TIMERFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_ADVISE_SYSCALLS=y
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_BASE_RELATIVE=y
CONFIG_EMBEDDED=y
CONFIG_SLUB=y
CONFIG_SLAB_MERGE_DEFAULT=y
CONFIG_M68K=y
CONFIG_CPU_BIG_ENDIAN=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_TIME_LOW_RES=y
CONFIG_NO_IOPORT_MAP=y
CONFIG_ZONE_DMA=y
CONFIG_HZ=100
CONFIG_PGTABLE_LEVELS=3
CONFIG_MMU=y
CONFIG_MMU_MOTOROLA=y
CONFIG_M68KCLASSIC=y
CONFIG_M68020=y
CONFIG_M68030=y
CONFIG_M68040=y
CONFIG_M68060=y
CONFIG_M68KFPU_EMU=y
CONFIG_M68KFPU_EMU_EXTRAPREC=y
CONFIG_ADVANCED=y
CONFIG_RMW_INSNS=y
CONFIG_ARCH_DISCONTIGMEM_ENABLE=y
CONFIG_NODES_SHIFT=3
CONFIG_CPU_HAS_ADDRESS_SPACES=y
CONFIG_FPU=y
CONFIG_ATARI=y
CONFIG_ATARI_KBD_CORE=y
CONFIG_PROC_HARDWARE=y
CONFIG_NATFEAT=y
CONFIG_NFBLOCK=y
CONFIG_NFCON=y
CONFIG_NFETH=y
CONFIG_CRASH_CORE=y
CONFIG_SET_FS=y
CONFIG_ARCH_32BIT_OFF_T=y
CONFIG_HAVE_ASM_MODVERSIONS=y
CONFIG_MMU_GATHER_NO_RANGE=y
CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG=y
CONFIG_ARCH_WANT_IPC_PARSE_VERSION=y
CONFIG_HAVE_MOD_ARCH_SPECIFIC=y
CONFIG_MODULES_USE_ELF_RELA=y
CONFIG_MODULES_USE_ELF_REL=y
CONFIG_HAVE_ARCH_NVRAM_OPS=y
CONFIG_OLD_SIGSUSPEND3=y
CONFIG_OLD_SIGACTION=y
CONFIG_COMPAT_32BIT_TIME=y
CONFIG_ARCH_NO_PREEMPT=y
CONFIG_RT_MUTEXES=y
CONFIG_BASE_SMALL=0
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_BLOCK=y
CONFIG_BLK_SCSI_REQUEST=y
CONFIG_BLK_DEV_BSG=y
CONFIG_PARTITION_ADVANCED=y
CONFIG_ATARI_PARTITION=y
CONFIG_MSDOS_PARTITION=y
CONFIG_INLINE_SPIN_UNLOCK_IRQ=y
CONFIG_INLINE_READ_UNLOCK=y
CONFIG_INLINE_READ_UNLOCK_IRQ=y
CONFIG_INLINE_WRITE_UNLOCK=y
CONFIG_INLINE_WRITE_UNLOCK_IRQ=y
CONFIG_BINFMT_ELF=y
CONFIG_ELFCORE=y
CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS=y
CONFIG_BINFMT_SCRIPT=y
CONFIG_ARCH_HAS_BINFMT_FLAT=y
CONFIG_BINFMT_FLAT_ARGVP_ENVP_ON_STACK=y
CONFIG_HAVE_AOUT=y
CONFIG_COREDUMP=y
CONFIG_DISCONTIGMEM=y
CONFIG_FLAT_NODE_MEM_MAP=y
CONFIG_NEED_MULTIPLE_NODES=y
CONFIG_SPLIT_PTLOCK_CPUS=4
CONFIG_BOUNCE=y
CONFIG_VIRT_TO_BUS=y
CONFIG_DEFAULT_MMAP_MIN_ADDR=4096
CONFIG_NEED_PER_CPU_KM=y
CONFIG_NET=y
CONFIG_PACKET=y
CONFIG_UNIX=y
CONFIG_UNIX_SCM=y
CONFIG_INET=y
CONFIG_TCP_CONG_CUBIC=y
CONFIG_DEFAULT_TCP_CONG="cubic"
CONFIG_IPV6=m
CONFIG_HAVE_NET_DSA=y
CONFIG_NET_RX_BUSY_POLL=y
CONFIG_BQL=y
CONFIG_DEVTMPFS=y
CONFIG_DEVTMPFS_MOUNT=y
CONFIG_STANDALONE=y
CONFIG_PREVENT_FIRMWARE_BUILD=y
CONFIG_GENERIC_CPU_DEVICES=y
CONFIG_BLK_DEV=y
CONFIG_BLK_DEV_LOOP=m
CONFIG_BLK_DEV_LOOP_MIN_COUNT=8
CONFIG_HAVE_IDE=y
CONFIG_SCSI_MOD=y
CONFIG_NETDEVICES=y
CONFIG_ETHERNET=y
CONFIG_INPUT=y
CONFIG_INPUT_MOUSEDEV=y
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
CONFIG_INPUT_EVDEV=m
CONFIG_INPUT_KEYBOARD=y
CONFIG_KEYBOARD_ATARI=y
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_ATARI=y
CONFIG_INPUT_MISC=y
CONFIG_INPUT_M68K_BEEP=y
CONFIG_TTY=y
CONFIG_VT=y
CONFIG_CONSOLE_TRANSLATIONS=y
CONFIG_VT_CONSOLE=y
CONFIG_HW_CONSOLE=y
CONFIG_VT_HW_CONSOLE_BINDING=y
CONFIG_UNIX98_PTYS=y
CONFIG_LDISC_AUTOLOAD=y
CONFIG_NVRAM=y
CONFIG_SSB_POSSIBLE=y
CONFIG_BCMA_POSSIBLE=y
CONFIG_FB_CMDLINE=y
CONFIG_FB_NOTIFY=y
CONFIG_FB=y
CONFIG_FB_CFB_FILLRECT=y
CONFIG_FB_CFB_COPYAREA=y
CONFIG_FB_CFB_IMAGEBLIT=y
CONFIG_FB_ATARI=y
CONFIG_DUMMY_CONSOLE=y
CONFIG_DUMMY_CONSOLE_COLUMNS=80
CONFIG_DUMMY_CONSOLE_ROWS=25
CONFIG_FRAMEBUFFER_CONSOLE=y
CONFIG_USB_OHCI_LITTLE_ENDIAN=y
CONFIG_RTC_LIB=y
CONFIG_RTC_CLASS=y
CONFIG_RTC_HCTOSYS=y
CONFIG_RTC_HCTOSYS_DEVICE="rtc0"
CONFIG_RTC_INTF_SYSFS=y
CONFIG_RTC_INTF_PROC=y
CONFIG_RTC_INTF_DEV=y
CONFIG_RTC_DRV_GENERIC=y
CONFIG_FS_IOMAP=y
CONFIG_EXT4_FS=y
CONFIG_EXT4_USE_FOR_EXT2=y
CONFIG_EXT4_FS_POSIX_ACL=y
CONFIG_EXT4_FS_SECURITY=y
CONFIG_JBD2=y
CONFIG_FS_MBCACHE=y
CONFIG_FS_POSIX_ACL=y
CONFIG_EXPORTFS=y
CONFIG_FILE_LOCKING=y
CONFIG_FSNOTIFY=y
CONFIG_DNOTIFY=y
CONFIG_INOTIFY_USER=y
CONFIG_FANOTIFY=y
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
CONFIG_PROC_SYSCTL=y
CONFIG_KERNFS=y
CONFIG_SYSFS=y
CONFIG_TMPFS=y
CONFIG_TMPFS_POSIX_ACL=y
CONFIG_TMPFS_XATTR=y
CONFIG_MEMFD_CREATE=y
CONFIG_SECURITY=y
CONFIG_HAVE_HARDENED_USERCOPY_ALLOCATOR=y
CONFIG_DEFAULT_SECURITY_DAC=y
CONFIG_LSM="lockdown,yama,loadpin,safesetid,integrity,bpf"
CONFIG_INIT_STACK_NONE=y
CONFIG_CRYPTO=y
CONFIG_CRYPTO_ALGAPI=y
CONFIG_CRYPTO_ALGAPI2=y
CONFIG_CRYPTO_HASH=y
CONFIG_CRYPTO_HASH2=y
CONFIG_CRYPTO_MANAGER_DISABLE_TESTS=y
CONFIG_CRYPTO_CRC32C=y
CONFIG_CRYPTO_AES=y
CONFIG_CRYPTO_LIB_AES=y
CONFIG_CRYPTO_LIB_POLY1305_RSIZE=1
CONFIG_BITREVERSE=y
CONFIG_GENERIC_STRNCPY_FROM_USER=y
CONFIG_GENERIC_STRNLEN_USER=y
CONFIG_GENERIC_NET_UTILS=y
CONFIG_GENERIC_PCI_IOMAP=y
CONFIG_GENERIC_IOMAP=y
CONFIG_CRC16=y
CONFIG_CRC32=y
CONFIG_CRC32_SLICEBY8=y
CONFIG_ZLIB_INFLATE=y
CONFIG_DECOMPRESS_GZIP=y
CONFIG_GENERIC_ALLOCATOR=y
CONFIG_HAS_IOMEM=y
CONFIG_HAS_DMA=y
CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE=y
CONFIG_ARCH_HAS_DMA_PREP_COHERENT=y
CONFIG_DMA_NONCOHERENT_MMAP=y
CONFIG_DMA_COHERENT_POOL=y
CONFIG_DMA_REMAP=y
CONFIG_DMA_DIRECT_REMAP=y
CONFIG_DQL=y
CONFIG_NLATTR=y
CONFIG_GENERIC_ATOMIC64=y
CONFIG_FONT_SUPPORT=y
CONFIG_FONT_8x8=y
CONFIG_FONT_8x16=y
CONFIG_SBITMAP=y
CONFIG_CONSOLE_LOGLEVEL_DEFAULT=7
CONFIG_CONSOLE_LOGLEVEL_QUIET=4
CONFIG_MESSAGE_LOGLEVEL_DEFAULT=4
CONFIG_DEBUG_BUGVERBOSE=y
CONFIG_ENABLE_MUST_CHECK=y
CONFIG_FRAME_WARN=1024
CONFIG_STRIP_ASM_SYMS=y
CONFIG_SECTION_MISMATCH_WARN_ONLY=y
CONFIG_MAGIC_SYSRQ=y
CONFIG_MAGIC_SYSRQ_DEFAULT_ENABLE=0x1
CONFIG_DEBUG_KERNEL=y
CONFIG_CC_HAS_WORKING_NOSANITIZE_ADDRESS=y
CONFIG_PANIC_ON_OOPS_VALUE=0
CONFIG_PANIC_TIMEOUT=0
CONFIG_HAVE_DEBUG_BUGVERBOSE=y
CONFIG_CC_HAS_SANCOV_TRACE_PC=y


2021-08-07 23:23:35

by Finn Thain

Subject: Re: [BISECTED][REGRESSION] 5.10.56 longterm kernel breakage on m68k/aranym

On Sat, 7 Aug 2021, Mikael Pettersson wrote:

> I updated the 5.10 longterm kernel on one of my m68k/aranym VMs from
> 5.10.47 to 5.10.56, and the new kernel failed to boot:
>
> ARAnyM 1.1.0
> Using config file: 'aranym1.headless.config'
> Could not open joystick 0
> ARAnyM RTC Timer: /dev/rtc: Permission denied
> ARAnyM LILO: Error loading ramdisk 'root.bin'
> Blitter tried to read byte from register ff8a00 at 0077ee
>
> At this point it kept running, but produced no output to the console,
> and would never get to the point of starting user-space. Attaching gdb
> to aranym showed nothing interesting, i.e. it seemed to be executing
> normally.
>
> A git bisect identified the following commit between 5.10.52 and
> 5.10.53 as the culprit:
> # first bad commit: [9e1cf2d1ed37c934c9935f2c0b2f8b15d9355654]
> mm/userfaultfd: fix uffd-wp special cases for fork()
>

That commit appeared in mainline between v5.13 and v5.14-rc1. Is mainline
also affected? e.g. v5.14-rc4.

> 5.10.52, 5.11.22, 5.12.19, and 5.13.8 all boot fine. 5.10.53 to
> 5.10.56 all fail as described above.
>
> grep ^CONFIG .config below, everything omitted is of course disabled,
> including I might add CONFIG_USERFAULTFD.
>
> /Mikael
>

2021-08-08 11:14:48

by Mikael Pettersson

Subject: Re: [BISECTED][REGRESSION] 5.10.56 longterm kernel breakage on m68k/aranym

On Sun, Aug 8, 2021 at 1:20 AM Finn Thain <[email protected]> wrote:
>
> On Sat, 7 Aug 2021, Mikael Pettersson wrote:
>
> > I updated the 5.10 longterm kernel on one of my m68k/aranym VMs from
> > 5.10.47 to 5.10.56, and the new kernel failed to boot:
> >
> > ARAnyM 1.1.0
> > Using config file: 'aranym1.headless.config'
> > Could not open joystick 0
> > ARAnyM RTC Timer: /dev/rtc: Permission denied
> > ARAnyM LILO: Error loading ramdisk 'root.bin'
> > Blitter tried to read byte from register ff8a00 at 0077ee
> >
> > At this point it kept running, but produced no output to the console,
> > and would never get to the point of starting user-space. Attaching gdb
> > to aranym showed nothing interesting, i.e. it seemed to be executing
> > normally.
> >
> > A git bisect identified the following commit between 5.10.52 and
> > 5.10.53 as the culprit:
> > # first bad commit: [9e1cf2d1ed37c934c9935f2c0b2f8b15d9355654]
> > mm/userfaultfd: fix uffd-wp special cases for fork()
> >
>
> That commit appeared in mainline between v5.13 and v5.14-rc1. Is mainline
> also affected? e.g. v5.14-rc4.

5.14-rc4 boots fine. I suspect the commit has some dependency that
hasn't been backported to 5.10 stable.

2021-08-09 02:07:35

by Finn Thain

Subject: Re: [BISECTED][REGRESSION] 5.10.56 longterm kernel breakage on m68k/aranym

On Sun, 8 Aug 2021, Mikael Pettersson wrote:

> On Sun, Aug 8, 2021 at 1:20 AM Finn Thain <[email protected]> wrote:
> >
> > On Sat, 7 Aug 2021, Mikael Pettersson wrote:
> >
> > > I updated the 5.10 longterm kernel on one of my m68k/aranym VMs from
> > > 5.10.47 to 5.10.56, and the new kernel failed to boot:
> > >
> > > ARAnyM 1.1.0
> > > Using config file: 'aranym1.headless.config'
> > > Could not open joystick 0
> > > ARAnyM RTC Timer: /dev/rtc: Permission denied
> > > ARAnyM LILO: Error loading ramdisk 'root.bin'
> > > Blitter tried to read byte from register ff8a00 at 0077ee
> > >
> > > At this point it kept running, but produced no output to the console,
> > > and would never get to the point of starting user-space. Attaching gdb
> > > to aranym showed nothing interesting, i.e. it seemed to be executing
> > > normally.
> > >
> > > A git bisect identified the following commit between 5.10.52 and
> > > 5.10.53 as the culprit:
> > > # first bad commit: [9e1cf2d1ed37c934c9935f2c0b2f8b15d9355654]
> > > mm/userfaultfd: fix uffd-wp special cases for fork()
> > >
> >
> > That commit appeared in mainline between v5.13 and v5.14-rc1. Is mainline
> > also affected? e.g. v5.14-rc4.
>
> 5.14-rc4 boots fine. I suspect the commit has some dependency that
> hasn't been backported to 5.10 stable.
>

On mainline, 9e1cf2d1ed3 is known as commit 8f34f1eac382 ("mm/userfaultfd:
fix uffd-wp special cases for fork()").

There are differences between the two commits that may be relevant. I
don't know.

If you check out 8f34f1eac382 and it boots, that would indicate either
missing dependencies in -stable, or that those differences are important.

OTOH, if 8f34f1eac382 fails in the same way as linux-5.10.y, it would
indicate that -stable is missing a fix that's present in v5.14-rc4.

2021-08-09 10:35:24

by Mikael Pettersson

Subject: Re: [BISECTED][REGRESSION] 5.10.56 longterm kernel breakage on m68k/aranym

On Mon, Aug 9, 2021 at 3:59 AM Finn Thain <[email protected]> wrote:
>
> On Sun, 8 Aug 2021, Mikael Pettersson wrote:
>
> > On Sun, Aug 8, 2021 at 1:20 AM Finn Thain <[email protected]> wrote:
> > >
> > > On Sat, 7 Aug 2021, Mikael Pettersson wrote:
> > >
> > > > I updated the 5.10 longterm kernel on one of my m68k/aranym VMs from
> > > > 5.10.47 to 5.10.56, and the new kernel failed to boot:
> > > >
> > > > ARAnyM 1.1.0
> > > > Using config file: 'aranym1.headless.config'
> > > > Could not open joystick 0
> > > > ARAnyM RTC Timer: /dev/rtc: Permission denied
> > > > ARAnyM LILO: Error loading ramdisk 'root.bin'
> > > > Blitter tried to read byte from register ff8a00 at 0077ee
> > > >
> > > > At this point it kept running, but produced no output to the console,
> > > > and would never get to the point of starting user-space. Attaching gdb
> > > > to aranym showed nothing interesting, i.e. it seemed to be executing
> > > > normally.
> > > >
> > > > A git bisect identified the following commit between 5.10.52 and
> > > > 5.10.53 as the culprit:
> > > > # first bad commit: [9e1cf2d1ed37c934c9935f2c0b2f8b15d9355654]
> > > > mm/userfaultfd: fix uffd-wp special cases for fork()
> > > >
> > >
> > > That commit appeared in mainline between v5.13 and v5.14-rc1. Is mainline
> > > also affected? e.g. v5.14-rc4.
> >
> > 5.14-rc4 boots fine. I suspect the commit has some dependency that
> > hasn't been backported to 5.10 stable.
> >
>
> On mainline, 9e1cf2d1ed3 is known as commit 8f34f1eac382 ("mm/userfaultfd:
> fix uffd-wp special cases for fork()").
>
> There are differences between the two commits that may be relevant. I
> don't know.
>
> If you checkout 8f34f1eac382 and if that works, it would indicate either
> missing dependencies in -stable, or those differences are important.
>
> OTOH, if 8f34f1eac382 fails in the same way as linux-5.10.y, it would
> indicate that -stable is missing a fix that's present in v5.14-rc4.

8f34f1eac382 boots fine.

2021-08-09 14:17:47

by Geert Uytterhoeven

Subject: Re: [BISECTED][REGRESSION] 5.10.56 longterm kernel breakage on m68k/aranym

CC Mike

On Mon, Aug 9, 2021 at 3:32 PM Mikael Pettersson <[email protected]> wrote:
> On Mon, Aug 9, 2021 at 3:59 AM Finn Thain <[email protected]> wrote:
> > On Sun, 8 Aug 2021, Mikael Pettersson wrote:
> > > On Sun, Aug 8, 2021 at 1:20 AM Finn Thain <[email protected]> wrote:
> > > > On Sat, 7 Aug 2021, Mikael Pettersson wrote:
> > > > > I updated the 5.10 longterm kernel on one of my m68k/aranym VMs from
> > > > > 5.10.47 to 5.10.56, and the new kernel failed to boot:
> > > > >
> > > > > ARAnyM 1.1.0
> > > > > Using config file: 'aranym1.headless.config'
> > > > > Could not open joystick 0
> > > > > ARAnyM RTC Timer: /dev/rtc: Permission denied
> > > > > ARAnyM LILO: Error loading ramdisk 'root.bin'
> > > > > Blitter tried to read byte from register ff8a00 at 0077ee
> > > > >
> > > > > At this point it kept running, but produced no output to the console,
> > > > > and would never get to the point of starting user-space. Attaching gdb
> > > > > to aranym showed nothing interesting, i.e. it seemed to be executing
> > > > > normally.
> > > > >
> > > > > A git bisect identified the following commit between 5.10.52 and
> > > > > 5.10.53 as the culprit:
> > > > > # first bad commit: [9e1cf2d1ed37c934c9935f2c0b2f8b15d9355654]
> > > > > mm/userfaultfd: fix uffd-wp special cases for fork()
> > > > >
> > > >
> > > > That commit appeared in mainline between v5.13 and v5.14-rc1. Is mainline
> > > > also affected? e.g. v5.14-rc4.
> > >
> > > 5.14-rc4 boots fine. I suspect the commit has some dependency that
> > > hasn't been backported to 5.10 stable.
> > >
> >
> > On mainline, 9e1cf2d1ed3 is known as commit 8f34f1eac382 ("mm/userfaultfd:
> > fix uffd-wp special cases for fork()").
> >
> > There are differences between the two commits that may be relevant. I
> > don't know.
> >
> > If you checkout 8f34f1eac382 and if that works, it would indicate either
> > missing dependencies in -stable, or those differences are important.
> >
> > OTOH, if 8f34f1eac382 fails in the same way as linux-5.10.y, it would
> > indicate that -stable is missing a fix that's present in v5.14-rc4.
>
> My initial bisect was wrong. I tried reverting 8f34f1eac382 from
> 5.10.57 but that made no difference, so I re-ran the git bisect with
> all known good points pre-marked. This landed on:
> # first bad commit: [ce6ee46e0f39ed97e23ebf7b5a565e0266a8a1a3]
> mm/page_alloc: fix memory map initialization for descending nodes
>
> Reverting _that_ from 5.10.57 does unbreak that kernel.
>
> Sorry about the confusion.

2021-08-09 15:16:35

by Mikael Pettersson

Subject: Re: [BISECTED][REGRESSION] 5.10.56 longterm kernel breakage on m68k/aranym

On Mon, Aug 9, 2021 at 3:59 AM Finn Thain <[email protected]> wrote:
>
> On Sun, 8 Aug 2021, Mikael Pettersson wrote:
>
> > On Sun, Aug 8, 2021 at 1:20 AM Finn Thain <[email protected]> wrote:
> > >
> > > On Sat, 7 Aug 2021, Mikael Pettersson wrote:
> > >
> > > > I updated the 5.10 longterm kernel on one of my m68k/aranym VMs from
> > > > 5.10.47 to 5.10.56, and the new kernel failed to boot:
> > > >
> > > > ARAnyM 1.1.0
> > > > Using config file: 'aranym1.headless.config'
> > > > Could not open joystick 0
> > > > ARAnyM RTC Timer: /dev/rtc: Permission denied
> > > > ARAnyM LILO: Error loading ramdisk 'root.bin'
> > > > Blitter tried to read byte from register ff8a00 at 0077ee
> > > >
> > > > At this point it kept running, but produced no output to the console,
> > > > and would never get to the point of starting user-space. Attaching gdb
> > > > to aranym showed nothing interesting, i.e. it seemed to be executing
> > > > normally.
> > > >
> > > > A git bisect identified the following commit between 5.10.52 and
> > > > 5.10.53 as the culprit:
> > > > # first bad commit: [9e1cf2d1ed37c934c9935f2c0b2f8b15d9355654]
> > > > mm/userfaultfd: fix uffd-wp special cases for fork()
> > > >
> > >
> > > That commit appeared in mainline between v5.13 and v5.14-rc1. Is mainline
> > > also affected? e.g. v5.14-rc4.
> >
> > 5.14-rc4 boots fine. I suspect the commit has some dependency that
> > hasn't been backported to 5.10 stable.
> >
>
> On mainline, 9e1cf2d1ed3 is known as commit 8f34f1eac382 ("mm/userfaultfd:
> fix uffd-wp special cases for fork()").
>
> There are differences between the two commits that may be relevant. I
> don't know.
>
> If you checkout 8f34f1eac382 and if that works, it would indicate either
> missing dependencies in -stable, or those differences are important.
>
> OTOH, if 8f34f1eac382 fails in the same way as linux-5.10.y, it would
> indicate that -stable is missing a fix that's present in v5.14-rc4.

My initial bisect was wrong. I tried reverting 8f34f1eac382 from
5.10.57 but that made no difference, so I re-ran the git bisect with
all known good points pre-marked. This landed on:
# first bad commit: [ce6ee46e0f39ed97e23ebf7b5a565e0266a8a1a3]
mm/page_alloc: fix memory map initialization for descending nodes

Reverting _that_ from 5.10.57 does unbreak that kernel.

Sorry about the confusion.

2021-08-10 16:45:42

by Mike Rapoport

Subject: Re: [BISECTED][REGRESSION] 5.10.56 longterm kernel breakage on m68k/aranym

On Mon, Aug 09, 2021 at 03:40:04PM +0200, Geert Uytterhoeven wrote:
> CC Mike
>
> On Mon, Aug 9, 2021 at 3:32 PM Mikael Pettersson <[email protected]> wrote:
> > On Mon, Aug 9, 2021 at 3:59 AM Finn Thain <[email protected]> wrote:
> > > On Sun, 8 Aug 2021, Mikael Pettersson wrote:
> > > > On Sun, Aug 8, 2021 at 1:20 AM Finn Thain <[email protected]> wrote:
> > > > > On Sat, 7 Aug 2021, Mikael Pettersson wrote:
> > > > > > I updated the 5.10 longterm kernel on one of my m68k/aranym VMs from
> > > > > > 5.10.47 to 5.10.56, and the new kernel failed to boot:
> > > > > >
> > > > > > ARAnyM 1.1.0
> > > > > > Using config file: 'aranym1.headless.config'
> > > > > > Could not open joystick 0
> > > > > > ARAnyM RTC Timer: /dev/rtc: Permission denied
> > > > > > ARAnyM LILO: Error loading ramdisk 'root.bin'
> > > > > > Blitter tried to read byte from register ff8a00 at 0077ee
> > > > > >
> > > > > > At this point it kept running, but produced no output to the console,
> > > > > > and would never get to the point of starting user-space. Attaching gdb
> > > > > > to aranym showed nothing interesting, i.e. it seemed to be executing
> > > > > > normally.
> >
> > My initial bisect was wrong. I tried reverting 8f34f1eac382 from
> > 5.10.57 but that made no difference, so I re-ran the git bisect with
> > all known good points pre-marked. This landed on:
> > # first bad commit: [ce6ee46e0f39ed97e23ebf7b5a565e0266a8a1a3]
> > mm/page_alloc: fix memory map initialization for descending nodes
> >
> > Reverting _that_ from 5.10.57 does unbreak that kernel.

Indeed, there is a problem with that commit in 5.10. The memmap
initialization relies on zone_to_nid() being available to link each struct
page to its node. In 5.10, however, zone_to_nid() is defined only for NUMA
and not for DISCONTIGMEM.

Mikael, can you please try the patch below:

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 9d0c454d23cd..63b550403317 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -445,7 +445,7 @@ struct zone {
 	 */
 	long lowmem_reserve[MAX_NR_ZONES];
 
-#ifdef CONFIG_NUMA
+#ifdef CONFIG_NEED_MULTIPLE_NODES
 	int node;
 #endif
 	struct pglist_data *zone_pgdat;
@@ -896,7 +896,7 @@ static inline bool populated_zone(struct zone *zone)
 	return zone->present_pages;
 }
 
-#ifdef CONFIG_NUMA
+#ifdef CONFIG_NEED_MULTIPLE_NODES
 static inline int zone_to_nid(struct zone *zone)
 {
 	return zone->node;
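
To illustrate why the guard matters, here is a minimal user-space sketch of
the two variants. It is a toy model, not kernel code: struct toy_zone, the
toy_* function names, and the "return 0" fallback stub are assumptions made
for illustration. The macro setup mirrors the posted .config, where
CONFIG_NEED_MULTIPLE_NODES is set and CONFIG_NUMA is not.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Toy model only: these names and the stub fallback are illustrative
 * assumptions, not the kernel's actual definitions.
 */

/* Mirrors the posted .config: DISCONTIGMEM with multiple nodes, no NUMA. */
#define CONFIG_NEED_MULTIPLE_NODES 1
/* CONFIG_NUMA is intentionally left undefined. */

struct toy_zone {
	int node;	/* node id the zone really belongs to */
};

/* Guard as in 5.10 before the patch: real lookup only under CONFIG_NUMA. */
static inline int toy_zone_to_nid_before(const struct toy_zone *zone)
{
#ifdef CONFIG_NUMA
	return zone->node;
#else
	(void)zone;
	return 0;	/* assumed stub: every zone appears to be on node 0 */
#endif
}

/* Guard as in the patch: real lookup whenever multiple nodes exist. */
static inline int toy_zone_to_nid_after(const struct toy_zone *zone)
{
#ifdef CONFIG_NEED_MULTIPLE_NODES
	return zone->node;
#else
	(void)zone;
	return 0;
#endif
}

/* Prints what each variant reports for a zone on the given node. */
static inline void toy_report(int node)
{
	struct toy_zone zone = { .node = node };

	/* With multiple nodes configured, "after" must match the real node. */
	assert(toy_zone_to_nid_after(&zone) == node);
	printf("node %d: before=%d after=%d\n", node,
	       toy_zone_to_nid_before(&zone), toy_zone_to_nid_after(&zone));
}
```

Calling toy_report(1) from a main() prints "node 1: before=0 after=1". Under
the old guard, a zone on node 1 is reported as node 0, which is the kind of
mismatch that would leave pages linked to the wrong node on a DISCONTIGMEM
configuration.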

--
Sincerely yours,
Mike.

2021-08-11 09:27:22

by Mikael Pettersson

Subject: Re: [BISECTED][REGRESSION] 5.10.56 longterm kernel breakage on m68k/aranym

On Tue, Aug 10, 2021 at 5:59 PM Mike Rapoport <[email protected]> wrote:
>
> On Mon, Aug 09, 2021 at 03:40:04PM +0200, Geert Uytterhoeven wrote:
> > CC Mike
> >
> > On Mon, Aug 9, 2021 at 3:32 PM Mikael Pettersson <[email protected]> wrote:
> > > On Mon, Aug 9, 2021 at 3:59 AM Finn Thain <[email protected]> wrote:
> > > > On Sun, 8 Aug 2021, Mikael Pettersson wrote:
> > > > > On Sun, Aug 8, 2021 at 1:20 AM Finn Thain <[email protected]> wrote:
> > > > > > On Sat, 7 Aug 2021, Mikael Pettersson wrote:
> > > > > > > I updated the 5.10 longterm kernel on one of my m68k/aranym VMs from
> > > > > > > 5.10.47 to 5.10.56, and the new kernel failed to boot:
> > > > > > >
> > > > > > > ARAnyM 1.1.0
> > > > > > > Using config file: 'aranym1.headless.config'
> > > > > > > Could not open joystick 0
> > > > > > > ARAnyM RTC Timer: /dev/rtc: Permission denied
> > > > > > > ARAnyM LILO: Error loading ramdisk 'root.bin'
> > > > > > > Blitter tried to read byte from register ff8a00 at 0077ee
> > > > > > >
> > > > > > > At this point it kept running, but produced no output to the console,
> > > > > > > and would never get to the point of starting user-space. Attaching gdb
> > > > > > > to aranym showed nothing interesting, i.e. it seemed to be executing
> > > > > > > normally.
> > >
> > > My initial bisect was wrong. I tried reverting 8f34f1eac382 from
> > > 5.10.57 but that made no difference, so I re-ran the git bisect with
> > > all known good points pre-marked. This landed on:
> > > # first bad commit: [ce6ee46e0f39ed97e23ebf7b5a565e0266a8a1a3]
> > > mm/page_alloc: fix memory map initialization for descending nodes
> > >
> > > Reverting _that_ from 5.10.57 does unbreak that kernel.
>
> Indeed there is a problem with that commit in 5.10. The memmap
> initialization relies on availability of zone_to_nid() to link struct page
> to a node. But in 5.10 zone_to_nid() is only defined for NUMA, but not for
> DISCONTIGMEM.
>
> Mikael, can you please try the patch below:
>
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 9d0c454d23cd..63b550403317 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -445,7 +445,7 @@ struct zone {
> */
> long lowmem_reserve[MAX_NR_ZONES];
>
> -#ifdef CONFIG_NUMA
> +#ifdef CONFIG_NEED_MULTIPLE_NODES
> int node;
> #endif
> struct pglist_data *zone_pgdat;
> @@ -896,7 +896,7 @@ static inline bool populated_zone(struct zone *zone)
> return zone->present_pages;
> }
>
> -#ifdef CONFIG_NUMA
> +#ifdef CONFIG_NEED_MULTIPLE_NODES
> static inline int zone_to_nid(struct zone *zone)
> {
> return zone->node;
>
> --
> Sincerely yours,
> Mike.

Applying this on top of 5.10.57 fixes the problem, thanks.

Tested-by: Mikael Pettersson <[email protected]>