2023-10-04 18:03:11

by Greg Kroah-Hartman

Subject: [PATCH 5.15 000/183] 5.15.134-rc1 review

This is the start of the stable review cycle for the 5.15.134 release.
There are 183 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.

Responses should be made by Fri, 06 Oct 2023 17:51:12 +0000.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.134-rc1.gz
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
and the diffstat can be found below.
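
For anyone scripting the download, here is a minimal sketch of how the two locations above could be derived from the release string. The helper names are hypothetical, and the assumption that the URL and branch always follow this pattern for other stable versions is mine, not something stated in this announcement.

```python
# Hypothetical helpers: derive the review patch URL and the stable-rc
# branch name from an -rc version string such as "5.15.134-rc1".
# The naming pattern is inferred from the single example in this mail.

STABLE_RC_REPO = (
    "git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git"
)


def review_patch_url(version: str) -> str:
    """Return the single-file patch URL for the given -rc version."""
    major = version.split(".", 1)[0]
    return (
        "https://www.kernel.org/pub/linux/kernel/"
        f"v{major}.x/stable-review/patch-{version}.gz"
    )


def stable_rc_branch(version: str) -> str:
    """Return the branch name, e.g. 'linux-5.15.y' for '5.15.134-rc1'."""
    major, minor = version.split(".")[:2]
    return f"linux-{major}.{minor}.y"


def fetch_command(version: str) -> list[str]:
    """Return the argument list for fetching the review branch with git."""
    return ["git", "fetch", STABLE_RC_REPO, stable_rc_branch(version)]


if __name__ == "__main__":
    print(review_patch_url("5.15.134-rc1"))
    print(" ".join(fetch_command("5.15.134-rc1")))
```

The command is returned as a list so it can be passed to `subprocess.run()` without shell quoting.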

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <[email protected]>
Linux 5.15.134-rc1

Florian Westphal <[email protected]>
netfilter: nf_tables: fix kdoc warnings after gc rework

Jani Nikula <[email protected]>
drm/meson: fix memory leak on ->hpd_notify callback

Greg Ungerer <[email protected]>
fs: binfmt_elf_efpic: fix personality for ELF-FDPIC

Matthias Schiffer <[email protected]>
ata: libata-sata: increase PMP SRST timeout to 10s

Damien Le Moal <[email protected]>
ata: libata-core: Do not register PM operations for SAS ports

Damien Le Moal <[email protected]>
ata: libata-core: Fix port and device removal

Damien Le Moal <[email protected]>
ata: libata-core: Fix ata_port_request_pm() locking

Mika Westerberg <[email protected]>
net: thunderbolt: Fix TCPv6 GSO checksum calculation

Nick Desaulniers <[email protected]>
bpf: Fix BTF_ID symbol generation collision in tools/

Jiri Olsa <[email protected]>
bpf: Fix BTF_ID symbol generation collision

Josef Bacik <[email protected]>
btrfs: properly report 0 avail for very full file systems

Steven Rostedt (Google) <[email protected]>
ring-buffer: Update "shortest_full" in polling

Ben Wolsieffer <[email protected]>
proc: nommu: /proc/<pid>/maps: release mmap read lock

Trond Myklebust <[email protected]>
Revert "SUNRPC dont update timeout value on connection reset"

Jens Axboe <[email protected]>
io_uring/fs: remove sqe->rw_flags checking from LINKAT

Joel Fernandes (Google) <[email protected]>
sched/rt: Fix live lock between select_fallback_rq() and RT push

Liam R. Howlett <[email protected]>
kernel/sched: Modify initial boot task idle setup

Heiner Kallweit <[email protected]>
i2c: i801: unregister tco_pdev in i801_probe() error path

Niklas Cassel <[email protected]>
ata: libata-scsi: ignore reserved bits for REPORT SUPPORTED OPERATION CODES

Kailang Yang <[email protected]>
ALSA: hda: Disable power save for solving pop issue on Lenovo ThinkCentre M70q

Pablo Neira Ayuso <[email protected]>
netfilter: nf_tables: disallow rule removal from chain binding

Pan Bian <[email protected]>
nilfs2: fix potential use after free in nilfs_gccache_submit_read_data()

Andy Shevchenko <[email protected]>
serial: 8250_port: Check IRQ data before use

Daniel Starke <[email protected]>
Revert "tty: n_gsm: fix UAF in gsm_cleanup_mux"

Ricky WU <[email protected]>
misc: rtsx: Fix some platforms can not boot and move the l1ss judgment to probe

Pu Wen <[email protected]>
x86/srso: Add SRSO mitigation for Hygon processors

Nicolin Chen <[email protected]>
iommu/arm-smmu-v3: Fix soft lockup triggered by arm_smmu_mm_invalidate_range

Vishal Goel <[email protected]>
Smack:- Use overlay inode label in smack_inode_copy_up()

Roberto Sassu <[email protected]>
smack: Retrieve transmuting information in smack_inode_getsecurity()

Roberto Sassu <[email protected]>
smack: Record transmuting in smk_transmuted

Irvin Cote <[email protected]>
nvme-pci: always return an ERR_PTR from nvme_pci_alloc_dev

Gleb Chesnokov <[email protected]>
scsi: qla2xxx: Fix NULL pointer dereference in target mode

Ian Rogers <[email protected]>
perf metric: Return early if no CPU PMU table exists

Andrii Staikov <[email protected]>
i40e: fix potential NULL pointer dereferencing of pf->vf i40e_sync_vsi_filters()

Mika Westerberg <[email protected]>
watchdog: iTCO_wdt: Set NO_REBOOT if the watchdog is not already running

Mika Westerberg <[email protected]>
watchdog: iTCO_wdt: No need to stop the timer in probe

Pratyush Yadav <[email protected]>
nvme-pci: do not set the NUMA node of device if it has none

Christoph Hellwig <[email protected]>
nvme-pci: factor out a nvme_pci_alloc_dev helper

Christoph Hellwig <[email protected]>
nvme-pci: factor the iod mempool creation into a helper

Chengming Zhou <[email protected]>
cgroup: Fix suspicious rcu_dereference_check() usage warning

Chengming Zhou <[email protected]>
sched/cpuacct: Optimize away RCU read lock

Arnaldo Carvalho de Melo <[email protected]>
perf build: Define YYNOMEM as YYNOABORT for bison < 3.81

Thomas Zimmermann <[email protected]>
fbdev/sh7760fb: Depend on FB=y

Johnathan Mantey <[email protected]>
ncsi: Propagate carrier gain/loss events to the NCSI controller

Benjamin Gray <[email protected]>
powerpc/watchpoints: Annotate atomic context in more places

Benjamin Gray <[email protected]>
powerpc/watchpoint: Disable pagefaults when getting user instruction

Benjamin Gray <[email protected]>
powerpc/watchpoints: Disable preemption in thread_change_pc()

Hans Verkuil <[email protected]>
media: vb2: frame_vector.c: replace WARN_ONCE with a comment

Chancel Liu <[email protected]>
ASoC: imx-rpmsg: Set ignore_pmdown_time for dai_link

Stanislav Fomichev <[email protected]>
bpf: Clarify error expectations from bpf_clone_redirect

Shengjiu Wang <[email protected]>
ASoC: fsl: imx-pcm-rpmsg: Add SNDRV_PCM_INFO_BATCH flag

Valentin Caron <[email protected]>
spi: stm32: add a delay before SPI disable

Han Xu <[email protected]>
spi: nxp-fspi: reset the FLSHxCR1 registers

Niklas Cassel <[email protected]>
ata: libata-eh: do not clear ATA_PFLAG_EH_PENDING in ata_eh_reset()

Steve French <[email protected]>
smb3: correct places where ENOTSUPP is used instead of preferred EOPNOTSUPP

Michal Grzedzicki <[email protected]>
scsi: pm80xx: Avoid leaking tags when processing OPC_INB_SET_CONTROLLER_CONFIG command

Michal Grzedzicki <[email protected]>
scsi: pm80xx: Use phy-specific SAS address when sending PHY_START command

David Francis <[email protected]>
drm/amdgpu: Handle null atom context in VBIOS info ioctl

Swapnil Patel <[email protected]>
drm/amd/display: Don't check registers, if using AUX BL control

David Thompson <[email protected]>
platform/mellanox: mlxbf-bootctl: add NET dependency into Kconfig

Steven Rostedt (Google) <[email protected]>
ring-buffer: Do not attempt to read past "commit"

Ricardo B. Marliere <[email protected]>
selftests: fix dependency checker script

Filipe Manana <[email protected]>
btrfs: improve error message after failure to add delayed dir index item

Zheng Yejian <[email protected]>
ring-buffer: Avoid softlockup in ring_buffer_resize()

Zheng Yejian <[email protected]>
selftests/ftrace: Correctly enable event in instance-event.tc

Kiwoong Kim <[email protected]>
scsi: ufs: core: Move __ufshcd_send_uic_cmd() outside host_lock

Javed Hasan <[email protected]>
scsi: qedf: Add synchronization between I/O completions and abort

Helge Deller <[email protected]>
parisc: irq: Make irq_stack_union static to avoid sparse warning

Helge Deller <[email protected]>
parisc: drivers: Fix sparse warning

Helge Deller <[email protected]>
parisc: iosapic.c: Fix sparse warnings

Helge Deller <[email protected]>
parisc: sba: Fix compile warning wrt list of SBA devices

Tobias Schramm <[email protected]>
spi: sun6i: fix race between DMA RX transfer completion and RX FIFO drain

Tobias Schramm <[email protected]>
spi: sun6i: reduce DMA RX transfer width to single byte

Sergey Senozhatsky <[email protected]>
dma-debug: don't call __dma_entry_alloc_check_leak() under free_entries_lock

William A. Kennington III <[email protected]>
i2c: npcm7xx: Fix callback completion ordering

Wenhua Lin <[email protected]>
gpio: pmic-eic-sprd: Add can_sleep flag for PMIC EIC chip

Nathan Rossi <[email protected]>
soc: imx8m: Enable OCOTP clock for imx8mm before reading registers

Max Filippov <[email protected]>
xtensa: boot/lib: fix function prototypes

Randy Dunlap <[email protected]>
xtensa: boot: don't add include-dirs

Randy Dunlap <[email protected]>
xtensa: iss/network: make functions static

Max Filippov <[email protected]>
xtensa: add default definition for XCHAL_HAVE_DIV32

Christophe JAILLET <[email protected]>
firmware: imx-dsp: Fix an error handling path in imx_dsp_setup_channels()

Dan Carpenter <[email protected]>
power: supply: ucs1002: fix error code in ucs1002_get_property()

Tony Lindgren <[email protected]>
bus: ti-sysc: Fix SYSC_QUIRK_SWSUP_SIDLE_ACT handling for uart wake-up

Tony Lindgren <[email protected]>
ARM: dts: ti: omap: motorola-mapphone: Fix abe_clkctrl warning on boot

Tony Lindgren <[email protected]>
ARM: dts: ti: omap: Fix bandgap thermal cells addressing for omap3/4

Krzysztof Kozlowski <[email protected]>
ARM: dts: omap: correct indentation

Thomas Gleixner <[email protected]>
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_56.RULE (part 1)

Timo Alho <[email protected]>
clk: tegra: fix error return case for recalc_rate

Adam Ford <[email protected]>
bus: ti-sysc: Fix missing AM35xx SoC matching

Julien Panis <[email protected]>
bus: ti-sysc: Use fsleep() instead of usleep_range() in sysc_reset()

Marek Vasut <[email protected]>
drm/bridge: ti-sn65dsi83: Do not generate HFP/HBP/HSA and EOT packet

Christoph Hellwig <[email protected]>
MIPS: Alchemy: only build mmc support helpers if au1xmmc is enabled

Qu Wenruo <[email protected]>
btrfs: reset destination buffer when read_extent_buffer() gets invalid range

Nilesh Javali <[email protected]>
scsi: qla2xxx: Use raw_smp_processor_id() instead of smp_processor_id()

Shreyas Deodhar <[email protected]>
scsi: qla2xxx: Select qpair depending on which CPU post_cmd() gets called

Werner Fischer <[email protected]>
ata: ahci: Add Elkhart Lake AHCI controller

Mario Limonciello <[email protected]>
ata: ahci: Rename board_ahci_mobile

Paul Menzel <[email protected]>
ata: ahci: Add support for AMD A85 FCH (Hudson D4)

Paul Menzel <[email protected]>
ata: libata: Rename link flag ATA_LFLAG_NO_DB_DELAY

Xiao Liang <[email protected]>
netfilter: nft_exthdr: Fix non-linear header modification

Florian Westphal <[email protected]>
netfilter: exthdr: add support for tcp option removal

Namhyung Kim <[email protected]>
perf build: Update build rule for generated files

Ian Rogers <[email protected]>
perf jevents: Switch build to use jevents.py

Werner Sembach <[email protected]>
Input: i8042 - add quirk for TUXEDO Gemini 17 Gen1/Clevo PD70PN

Huacai Chen <[email protected]>
Input: i8042 - rename i8042-x86ia64io.h to i8042-acpipnpio.h

Darrick J. Wong <[email protected]>
xfs: fix xfs_inodegc_stop racing with mod_delayed_work

Darrick J. Wong <[email protected]>
xfs: disable reaping in fscounters scrub

Darrick J. Wong <[email protected]>
xfs: check that per-cpu inodegc workers actually run on that cpu

Darrick J. Wong <[email protected]>
xfs: explicitly specify cpu when forcing inodegc delayed work to run immediately

Dave Chinner <[email protected]>
xfs: introduce xfs_inodegc_push()

Dave Chinner <[email protected]>
xfs: bound maximum wait time for inodegc work

Liang He <[email protected]>
i2c: mux: gpio: Add missing fwnode_handle_put()

Andy Shevchenko <[email protected]>
i2c: mux: gpio: Replace custom acpi_get_local_address()

Xiaoke Wang <[email protected]>
i2c: mux: demux-pinctrl: check the return value of devm_kstrdup()

Christophe JAILLET <[email protected]>
gpio: tb10x: Fix an error handling path in tb10x_gpio_probe()

Sasha Levin <[email protected]>
Fix up backport of 136191703038 ("interconnect: Teach lockdep about icc_bw_lock order")

Muhammad Husaini Zulkifli <[email protected]>
igc: Expose tx-usecs coalesce setting to user

Sebastian Andrzej Siewior <[email protected]>
bnxt_en: Flush XDP for bnxt_poll_nitroa0()'s NAPI

Sebastian Andrzej Siewior <[email protected]>
net: ena: Flush XDP packets on error.

Sebastian Andrzej Siewior <[email protected]>
locking/seqlock: Do the lockdep annotation before locking in do_write_seqcount_begin_nested()

Jozsef Kadlecsik <[email protected]>
netfilter: ipset: Fix race between IPSET_CMD_CREATE and IPSET_CMD_SWAP

Florian Westphal <[email protected]>
netfilter: nf_tables: disable toggling dormant table state more than once

Artem Chernyshev <[email protected]>
net: rds: Fix possible NULL-pointer dereference

Ziyang Xuan <[email protected]>
team: fix null-ptr-deref when team device type is changed

Eric Dumazet <[email protected]>
net: bridge: use DEV_STATS_INC()

Jie Wang <[email protected]>
net: hns3: add 5ms delay before clear firmware reset irq source

Jijie Shao <[email protected]>
net: hns3: fix fail to delete tc flower rules during reset issue

Jian Shen <[email protected]>
net: hns3: only enable unicast promisc when mac table full

Jie Wang <[email protected]>
net: hns3: fix GRE checksum offload issue

Josh Poimboeuf <[email protected]>
x86/srso: Fix SBPB enablement for spec_rstack_overflow=off

Josh Poimboeuf <[email protected]>
x86/srso: Fix srso_show_state() side effect

Stephen Boyd <[email protected]>
platform/x86: intel_scu_ipc: Fail IPC send if still busy

Stephen Boyd <[email protected]>
platform/x86: intel_scu_ipc: Don't override scu in intel_scu_ipc_dev_simple_command()

Stephen Boyd <[email protected]>
platform/x86: intel_scu_ipc: Check status upon timeout in ipc_wait_for_interrupt()

Stephen Boyd <[email protected]>
platform/x86: intel_scu_ipc: Check status after timeout in busy_loop()

Eric Dumazet <[email protected]>
dccp: fix dccp_v4_err()/dccp_v6_err() again

Kajol Jain <[email protected]>
powerpc/perf/hv-24x7: Update domain value check

Kyle Zeng <[email protected]>
ipv4: fix null-deref in ipv4_link_failure

Vinicius Costa Gomes <[email protected]>
igc: Fix infinite initialization loop with early XDP redirect

David Christensen <[email protected]>
ionic: fix 16bit math issue when PAGE_SIZE >= 64KB

Ivan Vecera <[email protected]>
i40e: Fix VF VLAN offloading when port VLAN is configured

Mateusz Palczewski <[email protected]>
i40e: Add VF VLAN pruning

Radoslaw Tyl <[email protected]>
iavf: do not process adminq tasks when __IAVF_IN_REMOVE_TASK is set

Shengjiu Wang <[email protected]>
ASoC: imx-audmix: Fix return error with devm_clk_get()

Sasha Neftin <[email protected]>
net/core: Fix ETH_P_1588 flow dissector

Sabrina Dubroca <[email protected]>
selftests: tls: swap the TX and RX sockets in some tests

Toke Høiland-Jørgensen <[email protected]>
bpf: Avoid deadlock when using queue and stack maps from NMI

Pablo Neira Ayuso <[email protected]>
netfilter: nf_tables: disallow element removal on anonymous sets

Jerome Brunet <[email protected]>
ASoC: meson: spdifin: start hw on dai probe

Florian Westphal <[email protected]>
netfilter: nf_tables: fix memleak when more than 255 elements expired

Pablo Neira Ayuso <[email protected]>
netfilter: nft_set_hash: try later when GC hits EAGAIN on iteration

Pablo Neira Ayuso <[email protected]>
netfilter: nft_set_pipapo: stop GC iteration if GC transaction allocation fails

Pablo Neira Ayuso <[email protected]>
netfilter: nft_set_pipapo: call nft_trans_gc_queue_sync() in catchall GC

Pablo Neira Ayuso <[email protected]>
netfilter: nft_set_rbtree: use read spinlock to avoid datapath contention

Pablo Neira Ayuso <[email protected]>
netfilter: nft_set_rbtree: skip sync GC for new elements in this transaction

Florian Westphal <[email protected]>
netfilter: nf_tables: defer gc run if previous batch is still pending

Pablo Neira Ayuso <[email protected]>
netfilter: nf_tables: use correct lock to protect gc_list

Pablo Neira Ayuso <[email protected]>
netfilter: nf_tables: GC transaction race with abort path

Pablo Neira Ayuso <[email protected]>
netfilter: nf_tables: GC transaction race with netns dismantle

Pablo Neira Ayuso <[email protected]>
netfilter: nf_tables: fix GC transaction races with netns and netlink event exit path

Florian Westphal <[email protected]>
netfilter: nf_tables: don't fail inserts if duplicate has expired

Pablo Neira Ayuso <[email protected]>
netfilter: nf_tables: remove busy mark and gc batch API

Pablo Neira Ayuso <[email protected]>
netfilter: nft_set_hash: mark set element as dead when deleting from packet path

Pablo Neira Ayuso <[email protected]>
netfilter: nf_tables: adapt set backend to use GC transaction API

Pablo Neira Ayuso <[email protected]>
netfilter: nf_tables: GC transaction API to avoid race with control plane

Florian Westphal <[email protected]>
netfilter: nf_tables: don't skip expired elements during walk

Steven Rostedt (Google) <[email protected]>
tracing: Have event inject files inc the trace array ref count

Jan Kara <[email protected]>
ext4: do not let fstrim block system suspend

Jan Kara <[email protected]>
ext4: move setting of trimmed bit into ext4_try_to_trim_range()

Kemeng Shi <[email protected]>
ext4: replace the traditional ternary conditional operator with with max()/min()

Lukas Czerner <[email protected]>
ext4: change s_last_trim_minblks type to unsigned long

Lukas Bulwahn <[email protected]>
ext4: scope ret locally in ext4_try_to_trim_range()

Szuying Chen <[email protected]>
ata: libahci: clear pending interrupt status

Hannes Reinecke <[email protected]>
ata: ahci: Drop pointless VPRINTK() calls and convert the remaining ones

Steven Rostedt (Google) <[email protected]>
tracing: Increase trace array ref count on enable and filter files

John Keeping <[email protected]>
tracing: Make trace_marker{,_raw} stream-like

Olga Kornievskaia <[email protected]>
NFSv4.1: fix pnfs MDS=DS session trunking

Olga Kornievskaia <[email protected]>
NFSv4.1: use EXCHGID4_FLAG_USE_PNFS_DS for DS server

Trond Myklebust <[email protected]>
SUNRPC: Mark the cred for revalidation if the server rejects it

Trond Myklebust <[email protected]>
NFS/pNFS: Report EINVAL errors from connect() to the server

Trond Myklebust <[email protected]>
NFS: More fixes for nfs_direct_write_reschedule_io()

Trond Myklebust <[email protected]>
NFS: Use the correct commit info in nfs_join_page_group()


-------------

Diffstat:

Makefile | 4 +-
arch/arm/boot/dts/am33xx.dtsi | 5 +-
arch/arm/boot/dts/am3517.dtsi | 5 +-
arch/arm/boot/dts/am4372.dtsi | 5 +-
arch/arm/boot/dts/artpec6-devboard.dts | 9 +-
arch/arm/boot/dts/dm814x.dtsi | 6 +-
arch/arm/boot/dts/dm816x.dtsi | 6 +-
arch/arm/boot/dts/dra62x.dtsi | 6 +-
arch/arm/boot/dts/dra7-dspeve-thermal.dtsi | 5 +-
arch/arm/boot/dts/dra7-iva-thermal.dtsi | 5 +-
arch/arm/boot/dts/imx6q-gk802.dts | 9 +-
arch/arm/boot/dts/motorola-mapphone-common.dtsi | 4 +-
arch/arm/boot/dts/omap-gpmc-smsc911x.dtsi | 6 +-
arch/arm/boot/dts/omap-gpmc-smsc9221.dtsi | 6 +-
arch/arm/boot/dts/omap2.dtsi | 5 +-
arch/arm/boot/dts/omap2420.dtsi | 5 +-
arch/arm/boot/dts/omap2430.dtsi | 5 +-
arch/arm/boot/dts/omap3-cm-t3517.dts | 12 +-
arch/arm/boot/dts/omap3-cpu-thermal.dtsi | 8 +-
arch/arm/boot/dts/omap3-gta04.dtsi | 6 +-
arch/arm/boot/dts/omap3-ldp.dts | 2 +-
arch/arm/boot/dts/omap3-n900.dts | 38 +-
arch/arm/boot/dts/omap3-zoom3.dts | 44 +--
arch/arm/boot/dts/omap3.dtsi | 5 +-
arch/arm/boot/dts/omap34xx.dtsi | 5 +-
arch/arm/boot/dts/omap36xx.dtsi | 5 +-
arch/arm/boot/dts/omap4-cpu-thermal.dtsi | 34 +-
arch/arm/boot/dts/omap443x.dtsi | 6 +-
arch/arm/boot/dts/omap4460.dtsi | 6 +-
arch/arm/boot/dts/omap5-cm-t54.dts | 56 +--
arch/arm/boot/dts/omap5-core-thermal.dtsi | 5 +-
arch/arm/boot/dts/omap5-gpu-thermal.dtsi | 5 +-
arch/arm/boot/dts/orion5x-lacie-d2-network.dts | 5 +-
.../dts/orion5x-lacie-ethernet-disk-mini-v2.dts | 9 +-
.../boot/dts/orion5x-maxtor-shared-storage-2.dts | 5 +-
arch/arm/boot/dts/orion5x-mv88f5181.dtsi | 9 +-
arch/arm/boot/dts/orion5x-mv88f5182.dtsi | 9 +-
arch/arm/boot/dts/orion5x-netgear-wnr854t.dts | 9 +-
arch/arm/boot/dts/orion5x-rd88f5182-nas.dts | 9 +-
arch/arm/boot/dts/orion5x.dtsi | 9 +-
arch/arm/include/asm/hardware/cache-aurora-l2.h | 5 +-
arch/arm/include/asm/hardware/cache-feroceon-l2.h | 6 +-
arch/arm/include/asm/hardware/cache-tauros2.h | 5 +-
arch/arm/mach-davinci/board-da830-evm.c | 6 +-
arch/arm/mach-davinci/board-da850-evm.c | 6 +-
arch/arm/mach-davinci/board-dm355-evm.c | 6 +-
arch/arm/mach-davinci/board-dm355-leopard.c | 5 +-
arch/arm/mach-davinci/board-dm644x-evm.c | 6 +-
arch/arm/mach-davinci/board-dm646x-evm.c | 7 +-
arch/arm/mach-davinci/board-mityomapl138.c | 5 +-
arch/arm/mach-davinci/board-neuros-osd2.c | 5 +-
arch/arm/mach-davinci/board-omapl138-hawk.c | 5 +-
arch/arm/mach-davinci/common.c | 6 +-
arch/arm/mach-davinci/cpuidle.h | 5 +-
arch/arm/mach-davinci/da830.c | 6 +-
arch/arm/mach-davinci/da850.c | 6 +-
arch/arm/mach-davinci/dm355.c | 6 +-
arch/arm/mach-davinci/dm644x.c | 6 +-
arch/arm/mach-davinci/dm646x.c | 6 +-
arch/arm/mach-davinci/include/mach/common.h | 6 +-
arch/arm/mach-davinci/include/mach/cputype.h | 6 +-
arch/arm/mach-davinci/include/mach/da8xx.h | 6 +-
arch/arm/mach-davinci/include/mach/hardware.h | 6 +-
arch/arm/mach-davinci/include/mach/serial.h | 6 +-
arch/arm/mach-davinci/mux.c | 6 +-
arch/arm/mach-davinci/mux.h | 6 +-
arch/arm/mach-davinci/pm_domain.c | 5 +-
arch/arm/mach-dove/bridge-regs.h | 9 +-
arch/arm/mach-dove/cm-a510.c | 5 +-
arch/arm/mach-dove/common.c | 5 +-
arch/arm/mach-dove/common.h | 5 +-
arch/arm/mach-dove/dove-db-setup.c | 5 +-
arch/arm/mach-dove/dove.h | 9 +-
arch/arm/mach-dove/irq.c | 5 +-
arch/arm/mach-dove/irqs.h | 9 +-
arch/arm/mach-dove/mpp.c | 5 +-
arch/arm/mach-dove/pcie.c | 5 +-
arch/arm/mach-dove/pm.h | 6 +-
arch/arm/mach-lpc18xx/board-dt.c | 5 +-
arch/arm/mach-lpc32xx/pm.c | 6 +-
arch/arm/mach-lpc32xx/suspend.S | 6 +-
arch/arm/mach-mv78xx0/bridge-regs.h | 6 +-
arch/arm/mach-mv78xx0/buffalo-wxl-setup.c | 5 +-
arch/arm/mach-mv78xx0/common.c | 5 +-
arch/arm/mach-mv78xx0/common.h | 5 +-
arch/arm/mach-mv78xx0/db78x00-bp-setup.c | 5 +-
arch/arm/mach-mv78xx0/irq.c | 5 +-
arch/arm/mach-mv78xx0/irqs.h | 9 +-
arch/arm/mach-mv78xx0/mpp.c | 5 +-
arch/arm/mach-mv78xx0/mpp.h | 6 +-
arch/arm/mach-mv78xx0/mv78xx0.h | 5 +-
arch/arm/mach-mv78xx0/pcie.c | 5 +-
arch/arm/mach-mv78xx0/rd78x00-masa-setup.c | 5 +-
arch/arm/mach-mvebu/armada-370-xp.h | 5 +-
arch/arm/mach-mvebu/board-v7.c | 5 +-
arch/arm/mach-mvebu/coherency.c | 5 +-
arch/arm/mach-mvebu/coherency.h | 6 +-
arch/arm/mach-mvebu/coherency_ll.S | 5 +-
arch/arm/mach-mvebu/common.h | 5 +-
arch/arm/mach-mvebu/cpu-reset.c | 5 +-
arch/arm/mach-mvebu/dove.c | 5 +-
arch/arm/mach-mvebu/headsmp-a9.S | 5 +-
arch/arm/mach-mvebu/headsmp.S | 5 +-
arch/arm/mach-mvebu/kirkwood.c | 5 +-
arch/arm/mach-mvebu/kirkwood.h | 5 +-
arch/arm/mach-mvebu/mvebu-soc-id.c | 5 +-
arch/arm/mach-mvebu/mvebu-soc-id.h | 5 +-
arch/arm/mach-mvebu/platsmp-a9.c | 5 +-
arch/arm/mach-mvebu/platsmp.c | 5 +-
arch/arm/mach-mvebu/pm-board.c | 5 +-
arch/arm/mach-mvebu/pm.c | 5 +-
arch/arm/mach-mvebu/pmsu.c | 5 +-
arch/arm/mach-mvebu/pmsu.h | 5 +-
arch/arm/mach-mvebu/pmsu_ll.S | 5 +-
arch/arm/mach-mvebu/system-controller.c | 5 +-
arch/arm/mach-omap1/include/mach/mtd-xip.h | 6 +-
arch/arm/mach-omap1/pm_bus.c | 6 +-
arch/arm/mach-omap2/prcm43xx.h | 5 +-
arch/arm/mach-omap2/vc.c | 6 +-
arch/arm/mach-orion5x/board-d2net.c | 5 +-
arch/arm/mach-orion5x/board-dt.c | 5 +-
arch/arm/mach-orion5x/board-rd88f5182.c | 5 +-
arch/arm/mach-orion5x/bridge-regs.h | 9 +-
arch/arm/mach-orion5x/common.c | 5 +-
arch/arm/mach-orion5x/db88f5281-setup.c | 5 +-
arch/arm/mach-orion5x/irq.c | 5 +-
arch/arm/mach-orion5x/irqs.h | 5 +-
arch/arm/mach-orion5x/kurobox_pro-setup.c | 5 +-
arch/arm/mach-orion5x/ls_hgl-setup.c | 5 +-
arch/arm/mach-orion5x/mpp.c | 5 +-
arch/arm/mach-orion5x/net2big-setup.c | 6 +-
arch/arm/mach-orion5x/orion5x.h | 5 +-
arch/arm/mach-orion5x/pci.c | 5 +-
arch/arm/mach-orion5x/rd88f5181l-fxo-setup.c | 5 +-
arch/arm/mach-orion5x/rd88f5181l-ge-setup.c | 5 +-
arch/arm/mach-orion5x/rd88f5182-setup.c | 5 +-
arch/arm/mach-orion5x/rd88f6183ap-ge-setup.c | 5 +-
arch/arm/mach-orion5x/ts78xx-setup.c | 5 +-
arch/arm/mach-orion5x/wnr854t-setup.c | 9 +-
arch/arm/mach-orion5x/wrt350n-v2-setup.c | 9 +-
arch/arm/mach-pxa/eseries.c | 7 +-
arch/arm/mach-pxa/standby.S | 6 +-
arch/arm/mach-spear/generic.h | 5 +-
arch/arm/mach-spear/include/mach/misc_regs.h | 5 +-
arch/arm/mach-spear/include/mach/spear.h | 5 +-
arch/arm/mach-spear/pl080.c | 5 +-
arch/arm/mach-spear/pl080.h | 5 +-
arch/arm/mach-spear/restart.c | 5 +-
arch/arm/mach-spear/spear1310.c | 5 +-
arch/arm/mach-spear/spear1340.c | 5 +-
arch/arm/mach-spear/spear13xx.c | 5 +-
arch/arm/mach-spear/spear300.c | 5 +-
arch/arm/mach-spear/spear310.c | 5 +-
arch/arm/mach-spear/spear320.c | 5 +-
arch/arm/mach-spear/spear3xx.c | 5 +-
arch/arm/mach-spear/spear6xx.c | 5 +-
arch/arm/mach-spear/time.c | 5 +-
arch/arm/mm/cache-feroceon-l2.c | 5 +-
arch/arm/mm/cache-tauros2.c | 5 +-
arch/mips/alchemy/devboards/db1000.c | 4 +
arch/mips/alchemy/devboards/db1200.c | 6 +
arch/mips/alchemy/devboards/db1300.c | 4 +
arch/parisc/include/asm/ropes.h | 3 +
arch/parisc/kernel/drivers.c | 2 +-
arch/parisc/kernel/irq.c | 2 +-
arch/powerpc/kernel/hw_breakpoint.c | 16 +-
arch/powerpc/kernel/hw_breakpoint_constraints.c | 7 +-
arch/powerpc/perf/hv-24x7.c | 2 +-
arch/x86/kernel/cpu/bugs.c | 4 +-
arch/x86/kernel/cpu/common.c | 2 +-
arch/xtensa/boot/Makefile | 3 +-
arch/xtensa/boot/lib/zmem.c | 5 +-
arch/xtensa/include/asm/core.h | 4 +
arch/xtensa/platforms/iss/network.c | 4 +-
drivers/ata/ahci.c | 111 +++---
drivers/ata/ahci_brcm.c | 2 +-
drivers/ata/ahci_xgene.c | 4 -
drivers/ata/libahci.c | 49 +--
drivers/ata/libata-core.c | 47 ++-
drivers/ata/libata-eh.c | 13 +-
drivers/ata/libata-sata.c | 2 +-
drivers/ata/libata-scsi.c | 2 +-
drivers/ata/libata-transport.c | 9 +-
drivers/ata/libata.h | 2 +
drivers/bus/ti-sysc.c | 31 +-
drivers/char/agp/parisc-agp.c | 2 -
drivers/clk/tegra/clk-bpmp.c | 2 +-
drivers/firmware/imx/imx-dsp.c | 1 +
drivers/gpio/gpio-pmic-eic-sprd.c | 1 +
drivers/gpio/gpio-tb10x.c | 6 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 17 +-
.../amd/display/dc/dce110/dce110_hw_sequencer.c | 4 +-
drivers/gpu/drm/bridge/ti-sn65dsi83.c | 4 +-
drivers/gpu/drm/meson/meson_encoder_hdmi.c | 2 +
drivers/i2c/busses/i2c-i801.c | 1 +
drivers/i2c/busses/i2c-npcm7xx.c | 17 +-
drivers/i2c/muxes/i2c-demux-pinctrl.c | 4 +
drivers/i2c/muxes/i2c-mux-gpio.c | 47 +--
.../serio/{i8042-x86ia64io.h => i8042-acpipnpio.h} | 13 +-
drivers/input/serio/i8042.h | 2 +-
drivers/interconnect/core.c | 1 +
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 27 +-
drivers/media/common/videobuf2/frame_vector.c | 6 +-
drivers/misc/cardreader/rts5227.c | 55 +--
drivers/misc/cardreader/rts5228.c | 57 +--
drivers/misc/cardreader/rts5249.c | 56 +--
drivers/misc/cardreader/rts5260.c | 43 +--
drivers/misc/cardreader/rts5261.c | 52 +--
drivers/misc/cardreader/rtsx_pcr.c | 51 ++-
drivers/net/ethernet/amazon/ena/ena_netdev.c | 3 +
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 5 +
drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 9 +
.../ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 13 +-
drivers/net/ethernet/intel/i40e/i40e.h | 1 +
drivers/net/ethernet/intel/i40e/i40e_ethtool.c | 9 +
drivers/net/ethernet/intel/i40e/i40e_main.c | 138 ++++++-
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 16 +-
drivers/net/ethernet/intel/iavf/iavf_main.c | 3 +-
drivers/net/ethernet/intel/igc/igc_ethtool.c | 31 +-
drivers/net/ethernet/intel/igc/igc_main.c | 2 +-
drivers/net/ethernet/pensando/ionic/ionic_dev.h | 1 +
drivers/net/ethernet/pensando/ionic/ionic_txrx.c | 10 +-
drivers/net/team/team.c | 10 +-
drivers/net/thunderbolt.c | 3 +-
drivers/nvme/host/pci.c | 121 ++++---
drivers/parisc/iosapic.c | 4 +-
drivers/parisc/iosapic_private.h | 4 +-
drivers/platform/mellanox/Kconfig | 1 +
drivers/platform/x86/intel_scu_ipc.c | 66 ++--
drivers/power/supply/ucs1002_power.c | 3 +-
drivers/scsi/pm8001/pm8001_hwi.c | 2 +-
drivers/scsi/pm8001/pm80xx_hwi.c | 4 +-
drivers/scsi/qedf/qedf_io.c | 10 +-
drivers/scsi/qedf/qedf_main.c | 7 +-
drivers/scsi/qla2xxx/qla_def.h | 3 +
drivers/scsi/qla2xxx/qla_init.c | 5 +-
drivers/scsi/qla2xxx/qla_inline.h | 58 +++
drivers/scsi/qla2xxx/qla_isr.c | 12 +-
drivers/scsi/qla2xxx/qla_nvme.c | 4 +
drivers/scsi/qla2xxx/qla_os.c | 6 +
drivers/scsi/qla2xxx/qla_target.c | 3 +-
drivers/scsi/qla2xxx/tcm_qla2xxx.c | 4 +-
drivers/scsi/ufs/ufshcd.c | 6 +-
drivers/soc/imx/soc-imx8m.c | 10 +
drivers/spi/spi-nxp-fspi.c | 7 +
drivers/spi/spi-stm32.c | 8 +
drivers/spi/spi-sun6i.c | 31 +-
drivers/tty/n_gsm.c | 4 +-
drivers/tty/serial/8250/8250_port.c | 5 +-
drivers/video/fbdev/Kconfig | 2 +-
drivers/watchdog/iTCO_wdt.c | 26 +-
fs/binfmt_elf_fdpic.c | 5 +-
fs/btrfs/delayed-inode.c | 7 +-
fs/btrfs/extent_io.c | 8 +-
fs/btrfs/super.c | 2 +-
fs/cifs/inode.c | 2 +-
fs/cifs/smb2ops.c | 6 +-
fs/ext4/ext4.h | 2 +-
fs/ext4/mballoc.c | 67 ++--
fs/nfs/direct.c | 25 +-
fs/nfs/flexfilelayout/flexfilelayout.c | 1 +
fs/nfs/nfs4client.c | 9 +-
fs/nfs/nfs4proc.c | 4 +
fs/nfs/write.c | 23 +-
fs/nilfs2/gcinode.c | 6 +-
fs/proc/task_nommu.c | 27 +-
fs/xfs/scrub/common.c | 25 --
fs/xfs/scrub/common.h | 2 -
fs/xfs/scrub/fscounters.c | 13 +-
fs/xfs/scrub/scrub.c | 2 -
fs/xfs/scrub/scrub.h | 1 -
fs/xfs/xfs_icache.c | 92 +++--
fs/xfs/xfs_icache.h | 1 +
fs/xfs/xfs_mount.h | 5 +-
fs/xfs/xfs_qm_syscalls.c | 9 +-
fs/xfs/xfs_super.c | 12 +-
fs/xfs/xfs_trace.h | 1 +
include/linux/btf_ids.h | 2 +-
include/linux/cgroup.h | 3 +-
include/linux/if_team.h | 2 +
include/linux/libata.h | 4 +-
include/linux/nfs_fs_sb.h | 1 +
include/linux/nfs_page.h | 4 +-
include/linux/seqlock.h | 2 +-
include/net/netfilter/nf_tables.h | 127 +++----
include/uapi/linux/bpf.h | 4 +-
io_uring/io_uring.c | 2 +-
kernel/bpf/queue_stack_maps.c | 21 +-
kernel/dma/debug.c | 20 +-
kernel/sched/core.c | 2 +-
kernel/sched/cpuacct.c | 4 +-
kernel/sched/cpupri.c | 1 +
kernel/sched/idle.c | 1 +
kernel/trace/ring_buffer.c | 10 +
kernel/trace/trace.c | 45 ++-
kernel/trace/trace.h | 2 +
kernel/trace/trace_events.c | 6 +-
kernel/trace/trace_events_inject.c | 3 +-
net/bridge/br_forward.c | 4 +-
net/bridge/br_input.c | 4 +-
net/core/flow_dissector.c | 2 +-
net/dccp/ipv4.c | 9 +-
net/dccp/ipv6.c | 9 +-
net/ipv4/route.c | 4 +-
net/ncsi/ncsi-aen.c | 5 +
net/netfilter/ipset/ip_set_core.c | 12 +-
net/netfilter/nf_tables_api.c | 400 +++++++++++++++++----
net/netfilter/nft_exthdr.c | 110 +++++-
net/netfilter/nft_set_hash.c | 87 +++--
net/netfilter/nft_set_pipapo.c | 71 ++--
net/netfilter/nft_set_rbtree.c | 161 +++++----
net/rds/rdma_transport.c | 12 +-
net/sunrpc/clnt.c | 15 +-
security/smack/smack.h | 1 +
security/smack/smack_lsm.c | 65 +++-
sound/pci/hda/hda_intel.c | 1 +
sound/soc/fsl/imx-audmix.c | 2 +-
sound/soc/fsl/imx-pcm-rpmsg.c | 1 +
sound/soc/fsl/imx-rpmsg.c | 8 +
sound/soc/meson/axg-spdifin.c | 49 +--
tools/build/Makefile.build | 10 +
tools/include/linux/btf_ids.h | 2 +-
tools/include/uapi/linux/bpf.h | 4 +-
tools/perf/Makefile.config | 19 +
tools/perf/Makefile.perf | 1 +
tools/perf/pmu-events/Build | 19 +-
tools/perf/pmu-events/empty-pmu-events.c | 158 ++++++++
tools/perf/util/Build | 6 +
tools/perf/util/metricgroup.c | 3 +
.../ftrace/test.d/instances/instance-event.tc | 2 +-
tools/testing/selftests/kselftest_deps.sh | 77 +++-
tools/testing/selftests/net/tls.c | 8 +-
332 files changed, 2602 insertions(+), 1896 deletions(-)
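
When comparing a locally applied series against the summary line above, a small parser can help. This is only a sketch: the function name is hypothetical, and the regular expression assumes the usual `git diff --stat` summary wording.

```python
import re

# Matches a diffstat summary such as
# "332 files changed, 2602 insertions(+), 1896 deletions(-)".
# The insertion and deletion parts are optional because git omits
# whichever one is zero.
SUMMARY_RE = re.compile(
    r"(?P<files>\d+) files? changed"
    r"(?:, (?P<ins>\d+) insertions?\(\+\))?"
    r"(?:, (?P<dels>\d+) deletions?\(-\))?"
)


def parse_diffstat_summary(line: str) -> tuple[int, int, int]:
    """Return (files, insertions, deletions) from a diffstat summary line."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError(f"not a diffstat summary: {line!r}")
    return (
        int(match.group("files")),
        int(match.group("ins") or 0),
        int(match.group("dels") or 0),
    )


if __name__ == "__main__":
    summary = "332 files changed, 2602 insertions(+), 1896 deletions(-)"
    print(parse_diffstat_summary(summary))
```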



2023-10-04 18:44:10

by Florian Fainelli

Subject: Re: [PATCH 5.15 000/183] 5.15.134-rc1 review

On 10/4/23 10:53, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.15.134 release.
> There are 183 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Fri, 06 Oct 2023 17:51:12 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.134-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h

perf fails to build with:

CC /local/users/fainelli/buildroot/output/arm/build/linux-custom/tools/perf/util/metricgroup.o
util/metricgroup.c: In function 'metricgroup__parse_groups':
util/metricgroup.c:1261:7: error: 'table' undeclared (first use in this function)
  if (!table)
       ^~~~~
util/metricgroup.c:1261:7: note: each undeclared identifier is reported only once for each function it appears in
make[6]: *** [/local/users/fainelli/buildroot/output/arm/build/linux-custom/tools/build/Makefile.build:97: /local/users/fainelli/buildroot/output/arm/build/linux-custom/tools/perf/util/metricgroup.o] Error 1

caused by c1ef510a0f2a879bf29ddebae766ec9f0790eb8f ("perf metric: Return
early if no CPU PMU table exists"). Dropping this commit allows the
build to continue.

I had reported in the previous cycle that 00facc760903be66 ("perf
jevents: Switch build to use jevents.py") was causing build failures:

https://lore.kernel.org/all/[email protected]/

Do we still want these commits to be included?
--
Florian

2023-10-05 01:02:16

by Shuah Khan

Subject: Re: [PATCH 5.15 000/183] 5.15.134-rc1 review

On 10/4/23 11:53, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.15.134 release.
> There are 183 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Fri, 06 Oct 2023 17:51:12 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.134-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
>

Compiled and booted on my test system. No dmesg regressions.

Tested-by: Shuah Khan <[email protected]>

thanks,
-- Shuah

2023-10-05 01:31:39

by SeongJae Park

Subject: Re: [PATCH 5.15 000/183] 5.15.134-rc1 review

Hello,

On Wed, 4 Oct 2023 19:53:51 +0200 Greg Kroah-Hartman <[email protected]> wrote:

> This is the start of the stable review cycle for the 5.15.134 release.
> There are 183 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Fri, 06 Oct 2023 17:51:12 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.134-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
> and the diffstat can be found below.

This rc kernel passes DAMON functionality test[1] on my test machine.
Attaching the test results summary below. Please note that I retrieved the
kernel from linux-stable-rc tree[2].

Tested-by: SeongJae Park <[email protected]>

[1] https://github.com/awslabs/damon-tests/tree/next/corr
[2] 6f28ecf24aef ("Linux 5.15.134-rc1")

Thanks,
SJ

[...]

---

# .config:1347:warning: override: reassigning to symbol DAMON
ok 12 selftests: damon-tests: build_i386_idle_flag.sh
# selftests: damon-tests: build_i386_highpte.sh
# .config:1347:warning: override: reassigning to symbol DAMON
ok 13 selftests: damon-tests: build_i386_highpte.sh
# selftests: damon-tests: build_nomemcg.sh
# .config:1348:warning: override: reassigning to symbol DAMON
# .config:1358:warning: override: reassigning to symbol CGROUPS
ok 14 selftests: damon-tests: build_nomemcg.sh
# kselftest dir '/home/sjpark/damon-tests-cont/linux/tools/testing/selftests/damon-tests' is in dirty state.
# the log is at '/home/sjpark/log'.
ok 1 selftests: damon: debugfs_attrs.sh
ok 1 selftests: damon-tests: kunit.sh
ok 2 selftests: damon-tests: huge_count_read_write.sh
ok 3 selftests: damon-tests: buffer_overflow.sh
ok 4 selftests: damon-tests: rm_contexts.sh
ok 5 selftests: damon-tests: record_null_deref.sh
ok 6 selftests: damon-tests: dbgfs_target_ids_read_before_terminate_race.sh
ok 7 selftests: damon-tests: dbgfs_target_ids_pid_leak.sh
ok 8 selftests: damon-tests: damo_tests.sh
ok 9 selftests: damon-tests: masim-record.sh
ok 10 selftests: damon-tests: build_i386.sh
ok 11 selftests: damon-tests: build_arm64.sh
ok 12 selftests: damon-tests: build_i386_idle_flag.sh
ok 13 selftests: damon-tests: build_i386_highpte.sh
ok 14 selftests: damon-tests: build_nomemcg.sh
PASS
_remote_run_corr.sh SUCCESS

2023-10-05 17:49:47

by Naresh Kamboju

Subject: Re: [PATCH 5.15 000/183] 5.15.134-rc1 review

On Wed, 4 Oct 2023 at 23:33, Greg Kroah-Hartman
<[email protected]> wrote:
>
> This is the start of the stable review cycle for the 5.15.134 release.
> There are 183 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Fri, 06 Oct 2023 17:51:12 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.134-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h

Results from Linaro’s test farm.
Regressions on x86.

The following kernel warning was noticed on x86 while booting stable-rc
5.15.134-rc1 with a kernel built from the selftest merge config.

Reported-by: Linux Kernel Functional Testing <[email protected]>

Has anyone else noticed this kernel warning?

This is always reproducible while booting x86 with a given config.

x86 boot log:
-----
[ 0.000000] Linux version 5.15.134-rc1 (tuxmake@tuxmake) (x86_64-linux-gnu-gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP @1696443178
...
[ 1.480701] ------------[ cut here ]------------
[ 1.481296] WARNING: CPU: 0 PID: 13 at kernel/rcu/tasks.h:958 trc_inspect_reader+0x80/0xb0
[ 1.481296] Modules linked in:
[ 1.481296] CPU: 0 PID: 13 Comm: rcu_tasks_trace Not tainted 5.15.134-rc1 #1
[ 1.481296] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS 2.5 11/26/2020
[ 1.481296] RIP: 0010:trc_inspect_reader+0x80/0xb0
[ 1.481296] Code: b6 83 45 04 00 00 84 c0 75 48 c6 83 45 04 00 00 01 b8 01 00 00 00 5b 41 5c 5d c3 cc cc cc cc 0f 94 c0 eb b4 f6 43 2c 02 75 02 <0f> 0b 48 83 05 36 f8 ee 02 01 b8 01 00 00 00 48 83 05 21 f8 ee 02
[ 1.481296] RSP: 0000:ffffb25e000afd70 EFLAGS: 00010046
[ 1.481296] RAX: 0000000000000000 RBX: ffff9b40c080d040 RCX: 0000000000000003
[ 1.481296] RDX: ffff9b4427b80000 RSI: 0000000000000000 RDI: ffff9b40c080d040
[ 1.481296] RBP: ffffb25e000afd80 R08: e32db91cdfdc3bef R09: 00000000035b89d4
[ 1.481296] R10: 000000006a495065 R11: 0000000000000030 R12: ffffffffae692100
[ 1.481296] R13: 0000000000000000 R14: ffff9b40c080d9a8 R15: 0000000000000000
[ 1.481296] FS: 0000000000000000(0000) GS:ffff9b4427a00000(0000) knlGS:0000000000000000
[ 1.481296] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1.481296] CR2: ffff9b4297201000 CR3: 00000002d5e26001 CR4: 00000000003706f0
[ 1.481296] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1.481296] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1.481296] Call Trace:
[ 1.481296] <TASK>
[ 1.481296] ? show_regs.cold+0x1a/0x1f
[ 1.481296] ? __warn+0x88/0x120
[ 1.481296] ? trc_inspect_reader+0x80/0xb0
[ 1.481296] ? report_bug+0xa8/0xd0
[ 1.481296] ? handle_bug+0x40/0x70
[ 1.481296] ? exc_invalid_op+0x18/0x70
[ 1.481296] ? asm_exc_invalid_op+0x1b/0x20
[ 1.481296] ? rcu_tasks_kthread+0x250/0x250
[ 1.481296] ? trc_inspect_reader+0x80/0xb0
[ 1.481296] ? rcu_tasks_kthread+0x250/0x250
[ 1.481296] try_invoke_on_locked_down_task+0x109/0x120
[ 1.481296] trc_wait_for_one_reader.part.0+0x48/0x270
[ 1.481296] rcu_tasks_trace_postscan+0x76/0xb0
[ 1.481296] rcu_tasks_wait_gp+0x186/0x380
[ 1.481296] ? _raw_spin_unlock_irqrestore+0x35/0x50
[ 1.481296] rcu_tasks_kthread+0x145/0x250
[ 1.481296] ? do_wait_intr_irq+0xc0/0xc0
[ 1.481296] ? synchronize_rcu_tasks_rude+0x20/0x20
[ 1.481296] kthread+0x146/0x170
[ 1.481296] ? set_kthread_struct+0x50/0x50
[ 1.481296] ret_from_fork+0x1f/0x30
[ 1.481296] </TASK>
[ 1.481296] irq event stamp: 132
[ 1.481296] hardirqs last enabled at (131): [<ffffffffaf7936a5>] _raw_spin_unlock_irqrestore+0x35/0x50
[ 1.481296] hardirqs last disabled at (132): [<ffffffffaf79345b>] _raw_spin_lock_irqsave+0x5b/0x60
[ 1.481296] softirqs last enabled at (54): [<ffffffffae69201c>] rcu_tasks_kthread+0x16c/0x250
[ 1.481296] softirqs last disabled at (50): [<ffffffffae69201c>] rcu_tasks_kthread+0x16c/0x250
[ 1.481296] ---[ end trace 5a00c61d8412a9ac ]---
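
For triage, the location of a warning like the one above can be pulled out of a boot log mechanically. The following is a minimal Python sketch, not the LKFT log-parser itself; the names `WARNING_RE` and `find_warnings` are hypothetical, and the sample is limited to lines copied from the trace above. Whitespace is collapsed first, so a WARNING line that a mail client wrapped across two lines still matches.

```python
import re

# Lines copied from the boot log above; the WARNING line is deliberately
# left wrapped, the way it often arrives in a mailed report.
LOG = """\
[ 1.480701] ------------[ cut here ]------------
[ 1.481296] WARNING: CPU: 0 PID: 13 at kernel/rcu/tasks.h:958
trc_inspect_reader+0x80/0xb0
[ 1.481296] CPU: 0 PID: 13 Comm: rcu_tasks_trace Not tainted 5.15.134-rc1 #1
"""

# Hypothetical pattern for the "WARNING: CPU: N PID: N at file:line func+off"
# header that the kernel prints at the top of a warning splat.
WARNING_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) at "
    r"(?P<file>\S+):(?P<line>\d+)\s+(?P<func>[^\s+]+)\+"
)


def find_warnings(log):
    """Return one dict per WARNING header found in the log text."""
    # Collapse all runs of whitespace (including newlines) to single spaces
    # so that wrapped headers are matched as if they were on one line.
    flat = " ".join(log.split())
    return [m.groupdict() for m in WARNING_RE.finditer(flat)]


if __name__ == "__main__":
    for w in find_warnings(LOG):
        print(f"{w['file']}:{w['line']} in {w['func']} (CPU {w['cpu']}, PID {w['pid']})")
```

Grouping reports by the `file:line` and function pair is one simple way to tell whether two boots hit the same warning.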


Links:
----
- https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-5.15.y/build/v5.15.133-184-g6f28ecf24aef/testrun/20260259/suite/log-parser-boot/test/check-kernel-exception/log
- https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-5.15.y/build/v5.15.133-184-g6f28ecf24aef/testrun/20260259/suite/log-parser-boot/tests/
Build: https://storage.tuxsuite.com/public/linaro/lkft/builds/2WJFhcfqqG69pqj6LWuI14kVoP5/

steps to reproduce:
--------
- https://storage.tuxsuite.com/public/linaro/lkft/builds/2WJFhcfqqG69pqj6LWuI14kVoP5/tuxmake_reproducer.sh

## Build
* kernel: 5.15.134-rc1
* git: https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc
* git branch: linux-5.15.y
* git commit: 6f28ecf24aef2896f4071dc6268d3fb5f8259c77
* git describe: v5.15.133-184-g6f28ecf24aef
* test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-5.15.y/build/v5.15.133-184-g6f28ecf24aef

## Test Regressions (compared to v5.15.133)
* x86, log-parser-boot
- check-kernel-exception
- check-kernel-warning

* x86, log-parser-test
- check-kernel-exception
- check-kernel-warning


## Metric Regressions (compared to v5.15.133)

## Test Fixes (compared to v5.15.133)

## Metric Fixes (compared to v5.15.133)

## Test result summary
total: 90392, pass: 71514, fail: 2557, skip: 16224, xfail: 97

## Build Summary
* arc: 4 total, 4 passed, 0 failed
* arm: 114 total, 114 passed, 0 failed
* arm64: 42 total, 42 passed, 0 failed
* i386: 32 total, 31 passed, 1 failed
* mips: 27 total, 26 passed, 1 failed
* parisc: 4 total, 4 passed, 0 failed
* powerpc: 26 total, 25 passed, 1 failed
* riscv: 11 total, 11 passed, 0 failed
* s390: 12 total, 11 passed, 1 failed
* sh: 13 total, 11 passed, 2 failed
* sparc: 8 total, 8 passed, 0 failed
* x86_64: 38 total, 38 passed, 0 failed
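
The test and build summary figures above can be cross-checked with a few lines of Python. This is only a sketch: the helper names `parse_test_summary` and `parse_build_summary` are hypothetical, and the input strings are copied from this report rather than fetched from the LKFT service.

```python
import re

# Copied from the "Test result summary" line of this report.
TEST_SUMMARY = "total: 90392, pass: 71514, fail: 2557, skip: 16224, xfail: 97"

# Copied from the "Build Summary" list of this report.
BUILD_SUMMARY = """\
* arc: 4 total, 4 passed, 0 failed
* arm: 114 total, 114 passed, 0 failed
* arm64: 42 total, 42 passed, 0 failed
* i386: 32 total, 31 passed, 1 failed
* mips: 27 total, 26 passed, 1 failed
* parisc: 4 total, 4 passed, 0 failed
* powerpc: 26 total, 25 passed, 1 failed
* riscv: 11 total, 11 passed, 0 failed
* s390: 12 total, 11 passed, 1 failed
* sh: 13 total, 11 passed, 2 failed
* sparc: 8 total, 8 passed, 0 failed
* x86_64: 38 total, 38 passed, 0 failed
"""


def parse_test_summary(line):
    """Turn 'key: N, key: N, ...' into a dict of integers."""
    return {key: int(value) for key, value in re.findall(r"(\w+): (\d+)", line)}


def parse_build_summary(text):
    """Map each architecture to a (total, passed, failed) tuple."""
    pattern = r"\* (\w+): (\d+) total, (\d+) passed, (\d+) failed"
    return {
        arch: (int(total), int(passed), int(failed))
        for arch, total, passed, failed in re.findall(pattern, text)
    }


if __name__ == "__main__":
    tests = parse_test_summary(TEST_SUMMARY)
    parts = sum(v for k, v in tests.items() if k != "total")
    print("test counts add up:", parts == tests["total"])

    builds = parse_build_summary(BUILD_SUMMARY)
    print("builds:", sum(t for t, _, _ in builds.values()),
          "failed:", sum(f for _, _, f in builds.values()))
    print("archs with failures:", sorted(a for a, (_, _, f) in builds.items() if f))
```

Listing the architectures with a non-zero failure count is usually the quickest way to see where to look first.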

## Test suites summary
* boot
* kselftest-android
* kselftest-arm64
* kselftest-breakpoints
* kselftest-capabilities
* kselftest-cgroup
* kselftest-clone3
* kselftest-core
* kselftest-cpu-hotplug
* kselftest-cpufreq
* kselftest-drivers-dma-buf
* kselftest-efivarfs
* kselftest-exec
* kselftest-filesystems
* kselftest-filesystems-binderfs
* kselftest-filesystems-epoll
* kselftest-firmware
* kselftest-fpu
* kselftest-ftrace
* kselftest-futex
* kselftest-gpio
* kselftest-intel_pstate
* kselftest-ipc
* kselftest-ir
* kselftest-kcmp
* kselftest-kexec
* kselftest-kvm
* kselftest-lib
* kselftest-membarrier
* kselftest-memfd
* kselftest-memory-hotplug
* kselftest-mincore
* kselftest-mount
* kselftest-mqueue
* kselftest-net
* kselftest-net-forwarding
* kselftest-net-mptcp
* kselftest-netfilter
* kselftest-nsfs
* kselftest-openat2
* kselftest-pid_namespace
* kselftest-pidfd
* kselftest-proc
* kselftest-pstore
* kselftest-ptrace
* kselftest-rseq
* kselftest-rtc
* kselftest-seccomp
* kselftest-sigaltstack
* kselftest-size
* kselftest-splice
* kselftest-static_keys
* kselftest-sync
* kselftest-sysctl
* kselftest-tc-testing
* kselftest-timens
* kselftest-tmpfs
* kselftest-tpm2
* kselftest-user
* kselftest-user_events
* kselftest-vDSO
* kselftest-vm
* kselftest-watchdog
* kselftest-x86
* kselftest-zram
* kunit
* kvm-unit-tests
* libgpiod
* log-parser-boot
* log-parser-test
* ltp-cap_bounds
* ltp-commands
* ltp-containers
* ltp-controllers
* ltp-cpuhotplug
* ltp-crypto
* ltp-cve
* ltp-dio
* ltp-fcntl-locktests
* ltp-filecaps
* ltp-fs
* ltp-fs_bind
* ltp-fs_perms_simple
* ltp-fsx
* ltp-hugetlb
* ltp-io
* ltp-ipc
* ltp-math
* ltp-mm
* ltp-nptl
* ltp-pty
* ltp-sched
* ltp-securebits
* ltp-smoke
* ltp-syscalls
* ltp-tracing
* network-basic-tests
* perf
* rcutorture
* v4l2-compliance

--
Linaro LKFT
https://lkft.linaro.org

2023-10-05 22:18:28

by Guenter Roeck

Subject: Re: [PATCH 5.15 000/183] 5.15.134-rc1 review

On Wed, Oct 04, 2023 at 07:53:51PM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.15.134 release.
> There are 183 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Fri, 06 Oct 2023 17:51:12 +0000.
> Anything received after that time might be too late.
>

Build results:
total: 160 pass: 158 fail: 2
Failed builds:
i386:tools/perf
x86_64:tools/perf
Qemu test results:
total: 509 pass: 509 fail: 0

Guenter

2023-10-06 07:43:53

by Ron Economos

Subject: Re: [PATCH 5.15 000/183] 5.15.134-rc1 review

On 10/4/23 10:53 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.15.134 release.
> There are 183 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Fri, 06 Oct 2023 17:51:12 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.134-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h

Built and booted successfully on RISC-V RV64 (HiFive Unmatched).

Tested-by: Ron Economos <[email protected]>

2023-10-06 09:32:57

by Jon Hunter

Subject: Re: [PATCH 5.15 000/183] 5.15.134-rc1 review

On Wed, 04 Oct 2023 19:53:51 +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.15.134 release.
> There are 183 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Fri, 06 Oct 2023 17:51:12 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.134-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h

All tests passing for Tegra ...

Test results for stable-v5.15:
10 builds: 10 pass, 0 fail
26 boots: 26 pass, 0 fail
94 tests: 94 pass, 0 fail

Linux version: 5.15.134-rc1-g6f28ecf24aef
Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000,
tegra194-p2972-0000, tegra194-p3509-0000+p3668-0000,
tegra20-ventana, tegra210-p2371-2180,
tegra210-p3450-0000, tegra30-cardhu-a04

Tested-by: Jon Hunter <[email protected]>

Jon

2023-10-06 10:25:26

by Greg Kroah-Hartman

Subject: Re: [PATCH 5.15 000/183] 5.15.134-rc1 review

On Wed, Oct 04, 2023 at 11:43:46AM -0700, Florian Fainelli wrote:
> On 10/4/23 10:53, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 5.15.134 release.
> > There are 183 patches in this series, all will be posted as a response
> > to this one. If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Fri, 06 Oct 2023 17:51:12 +0000.
> > Anything received after that time might be too late.
> >
> > The whole patch series can be found in one patch at:
> > https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.134-rc1.gz
> > or in the git tree and branch at:
> > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
> > and the diffstat can be found below.
> >
> > thanks,
> >
> > greg k-h
>
> perf fails to build with:
>
> CC /local/users/fainelli/buildroot/output/arm/build/linux-custom/tools/perf/util/metricgroup.o
> util/metricgroup.c: In function 'metricgroup__parse_groups':
> util/metricgroup.c:1261:7: error: 'table' undeclared (first use in this
> function)
> if (!table)
> ^~~~~
> util/metricgroup.c:1261:7: note: each undeclared identifier is reported only
> once for each function it appears in
> make[6]: *** [/local/users/fainelli/buildroot/output/arm/build/linux-custom/tools/build/Makefile.build:97: /local/users/fainelli/buildroot/output/arm/build/linux-custom/tools/perf/util/metricgroup.o]
> Error 1
>
> caused by c1ef510a0f2a879bf29ddebae766ec9f0790eb8f ("perf metric: Return
> early if no CPU PMU table exists"). Dropping this commit allows the build to
> continue.
>
> I had reported in the previous cycle that 00facc760903be66 ("perf jevents:
> Switch build to use jevents.py") was causing build failures:
>
> https://lore.kernel.org/all/[email protected]/
>
> do we still want these commits to be included?

No, I'll go drop them now, thanks for the report.

greg k-h

2023-10-06 10:38:28

by Harshit Mogalapalli

Subject: Re: [PATCH 5.15 000/183] 5.15.134-rc1 review



On 06/10/23 3:55 pm, Greg Kroah-Hartman wrote:
> On Wed, Oct 04, 2023 at 11:43:46AM -0700, Florian Fainelli wrote:
>> On 10/4/23 10:53, Greg Kroah-Hartman wrote:
>>> This is the start of the stable review cycle for the 5.15.134 release.
>>> There are 183 patches in this series, all will be posted as a response
>>> to this one. If anyone has any issues with these being applied, please
>>> let me know.
>>>
>>> Responses should be made by Fri, 06 Oct 2023 17:51:12 +0000.
>>> Anything received after that time might be too late.
>>>
>>> The whole patch series can be found in one patch at:
>>> https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.134-rc1.gz
>>> or in the git tree and branch at:
>>> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
>>> and the diffstat can be found below.
>>>
>>> thanks,
>>>
>>> greg k-h
>>
>> perf fails to build with:
>>
>> CC /local/users/fainelli/buildroot/output/arm/build/linux-custom/tools/perf/util/metricgroup.o
>> util/metricgroup.c: In function 'metricgroup__parse_groups':
>> util/metricgroup.c:1261:7: error: 'table' undeclared (first use in this
>> function)
>> if (!table)
>> ^~~~~
>> util/metricgroup.c:1261:7: note: each undeclared identifier is reported only
>> once for each function it appears in
>> make[6]: *** [/local/users/fainelli/buildroot/output/arm/build/linux-custom/tools/build/Makefile.build:97: /local/users/fainelli/buildroot/output/arm/build/linux-custom/tools/perf/util/metricgroup.o]
>> Error 1
>>
>> caused by c1ef510a0f2a879bf29ddebae766ec9f0790eb8f ("perf metric: Return
>> early if no CPU PMU table exists"). Dropping this commit allows the build to
>> continue.
>>
>> I had reported in the previous cycle that 00facc760903be66 ("perf jevents:
>> Switch build to use jevents.py") was causing build failures:
>>
>> https://lore.kernel.org/all/[email protected]/
>>
>> do we still want these commits to be included?
>
> No, I'll go drop them now, thanks for the report.

Thought:
This is not the first time we have seen build failures in tools/perf. Would
it make sense to add it to your own build tests to reduce the round-trip
time for these errors?

Note: After reverting three patches in perf/ the build succeeds:

Patch 151: c1ef510a0f2a perf metric: Return early if no CPU PMU table exists
Patch 81: 40ddac4ffc75 perf build: Update build rule for generated files
Patch 80: 8df938ed8c8a perf jevents: Switch build to use jevents.py


Thanks,
Harshit

>
> greg k-h

2023-10-06 11:04:12

by Greg Kroah-Hartman

Subject: Re: [PATCH 5.15 000/183] 5.15.134-rc1 review

On Fri, Oct 06, 2023 at 04:07:14PM +0530, Harshit Mogalapalli wrote:
>
>
> On 06/10/23 3:55 pm, Greg Kroah-Hartman wrote:
> > On Wed, Oct 04, 2023 at 11:43:46AM -0700, Florian Fainelli wrote:
> > > On 10/4/23 10:53, Greg Kroah-Hartman wrote:
> > > > This is the start of the stable review cycle for the 5.15.134 release.
> > > > There are 183 patches in this series, all will be posted as a response
> > > > to this one. If anyone has any issues with these being applied, please
> > > > let me know.
> > > >
> > > > Responses should be made by Fri, 06 Oct 2023 17:51:12 +0000.
> > > > Anything received after that time might be too late.
> > > >
> > > > The whole patch series can be found in one patch at:
> > > > https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.134-rc1.gz
> > > > or in the git tree and branch at:
> > > > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
> > > > and the diffstat can be found below.
> > > >
> > > > thanks,
> > > >
> > > > greg k-h
> > >
> > > perf fails to build with:
> > >
> > > CC /local/users/fainelli/buildroot/output/arm/build/linux-custom/tools/perf/util/metricgroup.o
> > > util/metricgroup.c: In function 'metricgroup__parse_groups':
> > > util/metricgroup.c:1261:7: error: 'table' undeclared (first use in this
> > > function)
> > > if (!table)
> > > ^~~~~
> > > util/metricgroup.c:1261:7: note: each undeclared identifier is reported only
> > > once for each function it appears in
> > > make[6]: *** [/local/users/fainelli/buildroot/output/arm/build/linux-custom/tools/build/Makefile.build:97: /local/users/fainelli/buildroot/output/arm/build/linux-custom/tools/perf/util/metricgroup.o]
> > > Error 1
> > >
> > > caused by c1ef510a0f2a879bf29ddebae766ec9f0790eb8f ("perf metric: Return
> > > early if no CPU PMU table exists"). Dropping this commit allows the build to
> > > continue.
> > >
> > > I had reported in the previous cycle that 00facc760903be66 ("perf jevents:
> > > Switch build to use jevents.py") was causing build failures:
> > >
> > > https://lore.kernel.org/all/[email protected]/
> > >
> > > do we still want these commits to be included?
> >
> > No, I'll go drop them now, thanks for the report.
>
> Thought:
> It's not the first time we see build failures in tools/perf -- would it make
> sense to add this to your own build tests to reduce the round trip time for
> these errors ?

Last time I tried to build perf, I couldn't do it at all so I just gave
up trying to test for it :)

> Note: After reverting three patches in perf/ the build succeeds:
>
> Patch 151: c1ef510a0f2a perf metric: Return early if no CPU PMU table exists
> Patch 81: 40ddac4ffc75 perf build: Update build rule for generated files
> Patch 80: 8df938ed8c8a perf jevents: Switch build to use jevents.py

All of these are now dropped.

thanks,

greg k-h

2023-10-06 12:16:29

by Sasha Levin

Subject: Re: [PATCH 5.15 000/183] 5.15.134-rc1 review

On Fri, Oct 06, 2023 at 01:03:30PM +0200, Greg Kroah-Hartman wrote:
>On Fri, Oct 06, 2023 at 04:07:14PM +0530, Harshit Mogalapalli wrote:
>>
>>
>> On 06/10/23 3:55 pm, Greg Kroah-Hartman wrote:
>> > On Wed, Oct 04, 2023 at 11:43:46AM -0700, Florian Fainelli wrote:
>> > > On 10/4/23 10:53, Greg Kroah-Hartman wrote:
>> > > > This is the start of the stable review cycle for the 5.15.134 release.
>> > > > There are 183 patches in this series, all will be posted as a response
>> > > > to this one. If anyone has any issues with these being applied, please
>> > > > let me know.
>> > > >
>> > > > Responses should be made by Fri, 06 Oct 2023 17:51:12 +0000.
>> > > > Anything received after that time might be too late.
>> > > >
>> > > > The whole patch series can be found in one patch at:
>> > > > https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.134-rc1.gz
>> > > > or in the git tree and branch at:
>> > > > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
>> > > > and the diffstat can be found below.
>> > > >
>> > > > thanks,
>> > > >
>> > > > greg k-h
>> > >
>> > > perf fails to build with:
>> > >
>> > > CC /local/users/fainelli/buildroot/output/arm/build/linux-custom/tools/perf/util/metricgroup.o
>> > > util/metricgroup.c: In function 'metricgroup__parse_groups':
>> > > util/metricgroup.c:1261:7: error: 'table' undeclared (first use in this
>> > > function)
>> > > if (!table)
>> > > ^~~~~
>> > > util/metricgroup.c:1261:7: note: each undeclared identifier is reported only
>> > > once for each function it appears in
>> > > make[6]: *** [/local/users/fainelli/buildroot/output/arm/build/linux-custom/tools/build/Makefile.build:97: /local/users/fainelli/buildroot/output/arm/build/linux-custom/tools/perf/util/metricgroup.o]
>> > > Error 1
>> > >
>> > > caused by c1ef510a0f2a879bf29ddebae766ec9f0790eb8f ("perf metric: Return
>> > > early if no CPU PMU table exists"). Dropping this commit allows the build to
>> > > continue.
>> > >
>> > > I had reported in the previous cycle that 00facc760903be66 ("perf jevents:
>> > > Switch build to use jevents.py") was causing build failures:
>> > >
>> > > https://lore.kernel.org/all/[email protected]/
>> > >
>> > > do we still want these commits to be included?
>> >
>> > No, I'll go drop them now, thanks for the report.
>>
>> Thought:
>> It's not the first time we see build failures in tools/perf -- would it make
>> sense to add this to your own build tests to reduce the round trip time for
>> these errors ?
>
>Last time I tried to build perf, I couldn't do it at all so I just gave
>up trying to test for it :)

Same... I've also removed perf from AUTOSEL for that reason.

--
Thanks,
Sasha

2023-10-06 16:22:31

by Liam R. Howlett

Subject: Re: [PATCH 5.15 000/183] 5.15.134-rc1 review

* Naresh Kamboju <[email protected]> [231005 13:49]:
> On Wed, 4 Oct 2023 at 23:33, Greg Kroah-Hartman
> <[email protected]> wrote:
> >
> > This is the start of the stable review cycle for the 5.15.134 release.
> > There are 183 patches in this series, all will be posted as a response
> > to this one. If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Fri, 06 Oct 2023 17:51:12 +0000.
> > Anything received after that time might be too late.
> >
> > The whole patch series can be found in one patch at:
> > https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.134-rc1.gz
> > or in the git tree and branch at:
> > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
> > and the diffstat can be found below.
> >
> > thanks,
> >
> > greg k-h
>
> Results from Linaro’s test farm.
> Regressions on x86.
>
> Following kernel warning noticed on x86 while booting stable-rc 5.15.134-rc1
> with selftest merge config built kernel.
>
> Reported-by: Linux Kernel Functional Testing <[email protected]>
>
> Anyone noticed this kernel warning ?
>
> This is always reproducible while booting x86 with a given config.

From that config:
#
# RCU Subsystem
#
CONFIG_TREE_RCU=y
# CONFIG_RCU_EXPERT is not set
CONFIG_SRCU=y
CONFIG_TREE_SRCU=y
CONFIG_TASKS_RCU_GENERIC=y
CONFIG_TASKS_RUDE_RCU=y
CONFIG_TASKS_TRACE_RCU=y
CONFIG_RCU_STALL_COMMON=y
CONFIG_RCU_NEED_SEGCBLIST=y
# end of RCU Subsystem

#
# RCU Debugging
#
CONFIG_PROVE_RCU=y
# CONFIG_RCU_SCALE_TEST is not set
# CONFIG_RCU_TORTURE_TEST is not set
# CONFIG_RCU_REF_SCALE_TEST is not set
CONFIG_RCU_CPU_STALL_TIMEOUT=21
CONFIG_RCU_TRACE=y
# CONFIG_RCU_EQS_DEBUG is not set
# end of RCU Debugging
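
The RCU options quoted above can also be read out of a config fragment programmatically, which helps when comparing a failing config against a passing one. A minimal Python sketch follows; `parse_kconfig` is a hypothetical helper, and the fragment is the excerpt quoted in this message, not the full config from the build.

```python
# Excerpt of the RCU sections quoted in this message.
CONFIG_FRAGMENT = """\
#
# RCU Subsystem
#
CONFIG_TREE_RCU=y
# CONFIG_RCU_EXPERT is not set
CONFIG_SRCU=y
CONFIG_TREE_SRCU=y
CONFIG_TASKS_RCU_GENERIC=y
CONFIG_TASKS_RUDE_RCU=y
CONFIG_TASKS_TRACE_RCU=y
CONFIG_RCU_STALL_COMMON=y
CONFIG_RCU_NEED_SEGCBLIST=y
# end of RCU Subsystem

#
# RCU Debugging
#
CONFIG_PROVE_RCU=y
# CONFIG_RCU_SCALE_TEST is not set
# CONFIG_RCU_TORTURE_TEST is not set
# CONFIG_RCU_REF_SCALE_TEST is not set
CONFIG_RCU_CPU_STALL_TIMEOUT=21
CONFIG_RCU_TRACE=y
# CONFIG_RCU_EQS_DEBUG is not set
# end of RCU Debugging
"""


def parse_kconfig(text):
    """Map each CONFIG_ symbol to its value; unset symbols map to 'n'."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# CONFIG_") and line.endswith(" is not set"):
            # "# CONFIG_FOO is not set" is how .config records a disabled option.
            name = line[len("# "):-len(" is not set")]
            options[name] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
        # Anything else (section banners, blank lines) is ignored.
    return options


if __name__ == "__main__":
    opts = parse_kconfig(CONFIG_FRAGMENT)
    enabled = sorted(name for name, value in opts.items() if value != "n")
    print("\n".join(enabled))
```

Running the same parser over two configs and diffing the resulting dictionaries shows which RCU options differ between them.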


>
> x86 boot log:
> -----
> [ 0.000000] Linux version 5.15.134-rc1 (tuxmake@tuxmake)
> (x86_64-linux-gnu-gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils
> for Debian) 2.40) #1 SMP @1696443178
> ...
> [ 1.480701] ------------[ cut here ]------------
> [ 1.481296] WARNING: CPU: 0 PID: 13 at kernel/rcu/tasks.h:958
> trc_inspect_reader+0x80/0xb0
> [ 1.481296] Modules linked in:
> [ 1.481296] CPU: 0 PID: 13 Comm: rcu_tasks_trace Not tainted 5.15.134-rc1 #1
> [ 1.481296] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
> 2.5 11/26/2020
> [ 1.481296] RIP: 0010:trc_inspect_reader+0x80/0xb0

This function has changed a lot, including the dropping of this
WARN_ON_ONCE(). The warning was replaced in 897ba84dc5aa ("rcu-tasks:
Handle idle tasks for recently offlined CPUs") with something that looks
equivalent so I'm not sure why it would not trigger in newer revisions.

Obviously the behaviour I changed was the test for the task being idle.
I am not sure how best to short-circuit that test from happening during
boot as I am not familiar with the RCU code.

It's also worth noting that the bug this fixes wasn't exposed until the
maple tree (added in v6.1) was used for the IRQ descriptors (added in
v6.5).

> [ 1.481296] Code: b6 83 45 04 00 00 84 c0 75 48 c6 83 45 04 00 00
> 01 b8 01 00 00 00 5b 41 5c 5d c3 cc cc cc cc 0f 94 c0 eb b4 f6 43 2c
> 02 75 02 <0f> 0b 48 83 05 36 f8 ee 02 01 b8 01 00 00 00 48 83 05 21 f8
> ee 02
> [ 1.481296] RSP: 0000:ffffb25e000afd70 EFLAGS: 00010046
> [ 1.481296] RAX: 0000000000000000 RBX: ffff9b40c080d040 RCX: 0000000000000003
> [ 1.481296] RDX: ffff9b4427b80000 RSI: 0000000000000000 RDI: ffff9b40c080d040
> [ 1.481296] RBP: ffffb25e000afd80 R08: e32db91cdfdc3bef R09: 00000000035b89d4
> [ 1.481296] R10: 000000006a495065 R11: 0000000000000030 R12: ffffffffae692100
> [ 1.481296] R13: 0000000000000000 R14: ffff9b40c080d9a8 R15: 0000000000000000
> [ 1.481296] FS: 0000000000000000(0000) GS:ffff9b4427a00000(0000)
> knlGS:0000000000000000
> [ 1.481296] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1.481296] CR2: ffff9b4297201000 CR3: 00000002d5e26001 CR4: 00000000003706f0
> [ 1.481296] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 1.481296] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 1.481296] Call Trace:
> [ 1.481296] <TASK>
> [ 1.481296] ? show_regs.cold+0x1a/0x1f
> [ 1.481296] ? __warn+0x88/0x120
> [ 1.481296] ? trc_inspect_reader+0x80/0xb0
> [ 1.481296] ? report_bug+0xa8/0xd0
> [ 1.481296] ? handle_bug+0x40/0x70
> [ 1.481296] ? exc_invalid_op+0x18/0x70
> [ 1.481296] ? asm_exc_invalid_op+0x1b/0x20
> [ 1.481296] ? rcu_tasks_kthread+0x250/0x250
> [ 1.481296] ? trc_inspect_reader+0x80/0xb0
> [ 1.481296] ? rcu_tasks_kthread+0x250/0x250
> [ 1.481296] try_invoke_on_locked_down_task+0x109/0x120
> [ 1.481296] trc_wait_for_one_reader.part.0+0x48/0x270
> [ 1.481296] rcu_tasks_trace_postscan+0x76/0xb0
> [ 1.481296] rcu_tasks_wait_gp+0x186/0x380
> [ 1.481296] ? _raw_spin_unlock_irqrestore+0x35/0x50
> [ 1.481296] rcu_tasks_kthread+0x145/0x250
> [ 1.481296] ? do_wait_intr_irq+0xc0/0xc0
> [ 1.481296] ? synchronize_rcu_tasks_rude+0x20/0x20
> [ 1.481296] kthread+0x146/0x170
> [ 1.481296] ? set_kthread_struct+0x50/0x50
> [ 1.481296] ret_from_fork+0x1f/0x30
> [ 1.481296] </TASK>
> [ 1.481296] irq event stamp: 132
> [ 1.481296] hardirqs last enabled at (131): [<ffffffffaf7936a5>]
> _raw_spin_unlock_irqrestore+0x35/0x50
> [ 1.481296] hardirqs last disabled at (132): [<ffffffffaf79345b>]
> _raw_spin_lock_irqsave+0x5b/0x60
> [ 1.481296] softirqs last enabled at (54): [<ffffffffae69201c>]
> rcu_tasks_kthread+0x16c/0x250
> [ 1.481296] softirqs last disabled at (50): [<ffffffffae69201c>]
> rcu_tasks_kthread+0x16c/0x250
> [ 1.481296] ---[ end trace 5a00c61d8412a9ac ]---
>
>
> Links:
> ----
> - https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-5.15.y/build/v5.15.133-184-g6f28ecf24aef/testrun/20260259/suite/log-parser-boot/test/check-kernel-exception/log
> - https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-5.15.y/build/v5.15.133-184-g6f28ecf24aef/testrun/20260259/suite/log-parser-boot/tests/
> Build: https://storage.tuxsuite.com/public/linaro/lkft/builds/2WJFhcfqqG69pqj6LWuI14kVoP5/
>
> steps to reproduce:
> --------
> - https://storage.tuxsuite.com/public/linaro/lkft/builds/2WJFhcfqqG69pqj6LWuI14kVoP5/tuxmake_reproducer.sh
>
> ## Build
> * kernel: 5.15.134-rc1
> * git: https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc
> * git branch: linux-5.15.y
> * git commit: 6f28ecf24aef2896f4071dc6268d3fb5f8259c77
> * git describe: v5.15.133-184-g6f28ecf24aef
> * test details:
> https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-5.15.y/build/v5.15.133-184-g6f28ecf24aef
>
> ## Test Regressions (compared to v5.15.133)
> * x86, log-parser-boot
> - check-kernel-exception
> - check-kernel-warning
>
> * x86, log-parser-test
> - check-kernel-exception
> - check-kernel-warning
>
>
> ## Metric Regressions (compared to v5.15.133)
>
> ## Test Fixes (compared to v5.15.133)
>
> ## Metric Fixes (compared to v5.15.133)
>
> ## Test result summary
> total: 90392, pass: 71514, fail: 2557, skip: 16224, xfail: 97
>
> ## Build Summary
> * arc: 4 total, 4 passed, 0 failed
> * arm: 114 total, 114 passed, 0 failed
> * arm64: 42 total, 42 passed, 0 failed
> * i386: 32 total, 31 passed, 1 failed
> * mips: 27 total, 26 passed, 1 failed
> * parisc: 4 total, 4 passed, 0 failed
> * powerpc: 26 total, 25 passed, 1 failed
> * riscv: 11 total, 11 passed, 0 failed
> * s390: 12 total, 11 passed, 1 failed
> * sh: 13 total, 11 passed, 2 failed
> * sparc: 8 total, 8 passed, 0 failed
> * x86_64: 38 total, 38 passed, 0 failed
>
> ## Test suites summary
> * boot
> * kselftest-android
> * kselftest-arm64
> * kselftest-breakpoints
> * kselftest-capabilities
> * kselftest-cgroup
> * kselftest-clone3
> * kselftest-core
> * kselftest-cpu-hotplug
> * kselftest-cpufreq
> * kselftest-drivers-dma-buf
> * kselftest-efivarfs
> * kselftest-exec
> * kselftest-filesystems
> * kselftest-filesystems-binderfs
> * kselftest-filesystems-epoll
> * kselftest-firmware
> * kselftest-fpu
> * kselftest-ftrace
> * kselftest-futex
> * kselftest-gpio
> * kselftest-intel_pstate
> * kselftest-ipc
> * kselftest-ir
> * kselftest-kcmp
> * kselftest-kexec
> * kselftest-kvm
> * kselftest-lib
> * kselftest-membarrier
> * kselftest-memfd
> * kselftest-memory-hotplug
> * kselftest-mincore
> * kselftest-mount
> * kselftest-mqueue
> * kselftest-net
> * kselftest-net-forwarding
> * kselftest-net-mptcp
> * kselftest-netfilter
> * kselftest-nsfs
> * kselftest-openat2
> * kselftest-pid_namespace
> * kselftest-pidfd
> * kselftest-proc
> * kselftest-pstore
> * kselftest-ptrace
> * kselftest-rseq
> * kselftest-rtc
> * kselftest-seccomp
> * kselftest-sigaltstack
> * kselftest-size
> * kselftest-splice
> * kselftest-static_keys
> * kselftest-sync
> * kselftest-sysctl
> * kselftest-tc-testing
> * kselftest-timens
> * kselftest-tmpfs
> * kselftest-tpm2
> * kselftest-user
> * kselftest-user_events
> * kselftest-vDSO
> * kselftest-vm
> * kselftest-watchdog
> * kselftest-x86
> * kselftest-zram
> * kunit
> * kvm-unit-tests
> * libgpiod
> * log-parser-boot
> * log-parser-test
> * ltp-cap_bounds
> * ltp-commands
> * ltp-containers
> * ltp-controllers
> * ltp-cpuhotplug
> * ltp-crypto
> * ltp-cve
> * ltp-dio
> * ltp-fcntl-locktests
> * ltp-filecaps
> * ltp-fs
> * ltp-fs_bind
> * ltp-fs_perms_simple
> * ltp-fsx
> * ltp-hugetlb
> * ltp-io
> * ltp-ipc
> * ltp-math
> * ltp-mm
> * ltp-nptl
> * ltp-pty
> * ltp-sched
> * ltp-securebits
> * ltp-smoke
> * ltp-syscalls
> * ltp-tracing
> * network-basic-tests
> * perf
> * rcutorture
> * v4l2-compliance
>
> --
> Linaro LKFT
> https://lkft.linaro.org

2023-10-06 16:48:16

by Paul E. McKenney

Subject: Re: [PATCH 5.15 000/183] 5.15.134-rc1 review

On Fri, Oct 06, 2023 at 12:20:38PM -0400, Liam R. Howlett wrote:
> * Naresh Kamboju <[email protected]> [231005 13:49]:
> > On Wed, 4 Oct 2023 at 23:33, Greg Kroah-Hartman
> > <[email protected]> wrote:
> > >
> > > This is the start of the stable review cycle for the 5.15.134 release.
> > > There are 183 patches in this series, all will be posted as a response
> > > to this one. If anyone has any issues with these being applied, please
> > > let me know.
> > >
> > > Responses should be made by Fri, 06 Oct 2023 17:51:12 +0000.
> > > Anything received after that time might be too late.
> > >
> > > The whole patch series can be found in one patch at:
> > > https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.134-rc1.gz
> > > or in the git tree and branch at:
> > > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
> > > and the diffstat can be found below.
> > >
> > > thanks,
> > >
> > > greg k-h
> >
> > Results from Linaro’s test farm.
> > Regressions on x86.
> >
> > Following kernel warning noticed on x86 while booting stable-rc 5.15.134-rc1
> > with selftest merge config built kernel.
> >
> > Reported-by: Linux Kernel Functional Testing <[email protected]>
> >
> > Anyone noticed this kernel warning ?
> >
> > This is always reproducible while booting x86 with a given config.
>
> From that config:
> #
> # RCU Subsystem
> #
> CONFIG_TREE_RCU=y
> # CONFIG_RCU_EXPERT is not set
> CONFIG_SRCU=y
> CONFIG_TREE_SRCU=y
> CONFIG_TASKS_RCU_GENERIC=y
> CONFIG_TASKS_RUDE_RCU=y
> CONFIG_TASKS_TRACE_RCU=y
> CONFIG_RCU_STALL_COMMON=y
> CONFIG_RCU_NEED_SEGCBLIST=y
> # end of RCU Subsystem
>
> #
> # RCU Debugging
> #
> CONFIG_PROVE_RCU=y
> # CONFIG_RCU_SCALE_TEST is not set
> # CONFIG_RCU_TORTURE_TEST is not set
> # CONFIG_RCU_REF_SCALE_TEST is not set
> CONFIG_RCU_CPU_STALL_TIMEOUT=21
> CONFIG_RCU_TRACE=y
> # CONFIG_RCU_EQS_DEBUG is not set
> # end of RCU Debugging
>
>
> >
> > x86 boot log:
> > -----
> > [ 0.000000] Linux version 5.15.134-rc1 (tuxmake@tuxmake)
> > (x86_64-linux-gnu-gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils
> > for Debian) 2.40) #1 SMP @1696443178
> > ...
> > [ 1.480701] ------------[ cut here ]------------
> > [ 1.481296] WARNING: CPU: 0 PID: 13 at kernel/rcu/tasks.h:958
> > trc_inspect_reader+0x80/0xb0
> > [ 1.481296] Modules linked in:
> > [ 1.481296] CPU: 0 PID: 13 Comm: rcu_tasks_trace Not tainted 5.15.134-rc1 #1
> > [ 1.481296] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
> > 2.5 11/26/2020
> > [ 1.481296] RIP: 0010:trc_inspect_reader+0x80/0xb0
>
> This function has changed a lot, including the dropping of this
> WARN_ON_ONCE(). The warning was replaced in 897ba84dc5aa ("rcu-tasks:
> Handle idle tasks for recently offlined CPUs") with something that looks
> equivalent so I'm not sure why it would not trigger in newer revisions.
>
> Obviously the behaviour I changed was the test for the task being idle.
> I am not sure how best to short-circuit that test from happening during
> boot as I am not familiar with the RCU code.

The usual test for RCU's notion of early boot being completed is
(rcu_scheduler_active != RCU_SCHEDULER_INIT).

Except that "ofl" should always be false that early in boot, at least
in mainline.
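
For anyone following along, here is a minimal userspace sketch of that
test. The enum values and the early_boot_done() helper are stand-ins made
up for illustration; they are assumptions, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the kernel's rcu_scheduler_active states (values assumed). */
enum {
	RCU_SCHEDULER_INACTIVE,
	RCU_SCHEDULER_INIT,
	RCU_SCHEDULER_RUNNING,
};

/*
 * Hypothetical helper: evaluates the expression quoted above against a
 * given scheduler state instead of the kernel's global variable.
 */
bool early_boot_done(int rcu_scheduler_active)
{
	/* Reject values outside the modeled state range. */
	assert(rcu_scheduler_active >= RCU_SCHEDULER_INACTIVE &&
	       rcu_scheduler_active <= RCU_SCHEDULER_RUNNING);

	return rcu_scheduler_active != RCU_SCHEDULER_INIT;
}
```

With these stand-ins, early_boot_done(RCU_SCHEDULER_INIT) is false and
early_boot_done(RCU_SCHEDULER_RUNNING) is true.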

> It's also worth noting that the bug this fixes wasn't exposed until the
> maple tree (added in v6.1) was used for the IRQ descriptors (added in
> v6.5).

Lots of latent bugs, to be sure, even with rcutorture. :-/

Thanx, Paul


2023-10-06 17:23:35

by Florian Fainelli

Subject: Re: [PATCH 5.15 000/183] 5.15.134-rc1 review

On 10/6/23 05:15, Sasha Levin wrote:
> On Fri, Oct 06, 2023 at 01:03:30PM +0200, Greg Kroah-Hartman wrote:
>> On Fri, Oct 06, 2023 at 04:07:14PM +0530, Harshit Mogalapalli wrote:
>>>
>>>
>>> On 06/10/23 3:55 pm, Greg Kroah-Hartman wrote:
>>> > On Wed, Oct 04, 2023 at 11:43:46AM -0700, Florian Fainelli wrote:
>>> > > On 10/4/23 10:53, Greg Kroah-Hartman wrote:
>>> > > > This is the start of the stable review cycle for the 5.15.134 release.
>>> > > > There are 183 patches in this series, all will be posted as a response
>>> > > > to this one. If anyone has any issues with these being applied, please
>>> > > > let me know.
>>> > > >
>>> > > > Responses should be made by Fri, 06 Oct 2023 17:51:12 +0000.
>>> > > > Anything received after that time might be too late.
>>> > > >
>>> > > > The whole patch series can be found in one patch at:
>>> > > > https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.134-rc1.gz
>>> > > > or in the git tree and branch at:
>>> > > > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
>>> > > > and the diffstat can be found below.
>>> > > >
>>> > > > thanks,
>>> > > >
>>> > > > greg k-h
>>> > >
>>> > > perf fails to build with:
>>> > >
>>> > >    CC /local/users/fainelli/buildroot/output/arm/build/linux-custom/tools/perf/util/metricgroup.o
>>> > > util/metricgroup.c: In function 'metricgroup__parse_groups':
>>> > > util/metricgroup.c:1261:7: error: 'table' undeclared (first use in this function)
>>> > >    if (!table)
>>> > >         ^~~~~
>>> > > util/metricgroup.c:1261:7: note: each undeclared identifier is reported only once for each function it appears in
>>> > > make[6]: *** [/local/users/fainelli/buildroot/output/arm/build/linux-custom/tools/build/Makefile.build:97: /local/users/fainelli/buildroot/output/arm/build/linux-custom/tools/perf/util/metricgroup.o] Error 1
>>> > >
>>> > > caused by c1ef510a0f2a879bf29ddebae766ec9f0790eb8f ("perf metric: Return
>>> > > early if no CPU PMU table exists"). Dropping this commit allows the build to
>>> > > continue.
>>> > >
>>> > > I had reported in the previous cycle that 00facc760903be66 ("perf jevents:
>>> > > Switch build to use jevents.py") was causing build failures:
>>> > >
>>> > > https://lore.kernel.org/all/[email protected]/
>>> > >
>>> > > do we still want these commits to be included?
>>> >
>>> > No, I'll go drop them now, thanks for the report.
>>>
>>> Thought:
>>> It's not the first time we see build failures in tools/perf -- would it make
>>> sense to add this to your own build tests to reduce the round trip time for
>>> these errors ?
>>
>> Last time I tried to build perf, I couldn't do it at all so I just gave
>> up trying to test for it :)
>
> Same... I've also removed perf from AUTOSEL for that reason.

I suppose that is fair, if there is a critical bug in perf, we could
submit it "manually" to ensure it reaches the stable trees. Probably
better managed that way. Thanks!
--
Florian

2023-10-06 17:58:43

by Liam R. Howlett

Subject: Re: [PATCH 5.15 000/183] 5.15.134-rc1 review

* Paul E. McKenney <[email protected]> [231006 12:47]:
> On Fri, Oct 06, 2023 at 12:20:38PM -0400, Liam R. Howlett wrote:
> > * Naresh Kamboju <[email protected]> [231005 13:49]:
> > > On Wed, 4 Oct 2023 at 23:33, Greg Kroah-Hartman
> > > <[email protected]> wrote:
> > > >
> > > > This is the start of the stable review cycle for the 5.15.134 release.
> > > > There are 183 patches in this series, all will be posted as a response
> > > > to this one. If anyone has any issues with these being applied, please
> > > > let me know.
> > > >
> > > > Responses should be made by Fri, 06 Oct 2023 17:51:12 +0000.
> > > > Anything received after that time might be too late.
> > > >
> > > > The whole patch series can be found in one patch at:
> > > > https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.134-rc1.gz
> > > > or in the git tree and branch at:
> > > > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
> > > > and the diffstat can be found below.
> > > >
> > > > thanks,
> > > >
> > > > greg k-h
> > >
> > > Results from Linaro’s test farm.
> > > Regressions on x86.
> > >
> > > Following kernel warning noticed on x86 while booting stable-rc 5.15.134-rc1
> > > with selftest merge config built kernel.
> > >
> > > Reported-by: Linux Kernel Functional Testing <[email protected]>
> > >
> > > Anyone noticed this kernel warning ?
> > >
> > > This is always reproducible while booting x86 with a given config.
> >
> > From that config:
> > #
> > # RCU Subsystem
> > #
> > CONFIG_TREE_RCU=y
> > # CONFIG_RCU_EXPERT is not set
> > CONFIG_SRCU=y
> > CONFIG_TREE_SRCU=y
> > CONFIG_TASKS_RCU_GENERIC=y
> > CONFIG_TASKS_RUDE_RCU=y
> > CONFIG_TASKS_TRACE_RCU=y
> > CONFIG_RCU_STALL_COMMON=y
> > CONFIG_RCU_NEED_SEGCBLIST=y
> > # end of RCU Subsystem
> >
> > #
> > # RCU Debugging
> > #
> > CONFIG_PROVE_RCU=y
> > # CONFIG_RCU_SCALE_TEST is not set
> > # CONFIG_RCU_TORTURE_TEST is not set
> > # CONFIG_RCU_REF_SCALE_TEST is not set
> > CONFIG_RCU_CPU_STALL_TIMEOUT=21
> > CONFIG_RCU_TRACE=y
> > # CONFIG_RCU_EQS_DEBUG is not set
> > # end of RCU Debugging
> >
> >
> > >
> > > x86 boot log:
> > > -----
> > > [ 0.000000] Linux version 5.15.134-rc1 (tuxmake@tuxmake)
> > > (x86_64-linux-gnu-gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils
> > > for Debian) 2.40) #1 SMP @1696443178
> > > ...
> > > [ 1.480701] ------------[ cut here ]------------
> > > [ 1.481296] WARNING: CPU: 0 PID: 13 at kernel/rcu/tasks.h:958
> > > trc_inspect_reader+0x80/0xb0
> > > [ 1.481296] Modules linked in:
> > > [ 1.481296] CPU: 0 PID: 13 Comm: rcu_tasks_trace Not tainted 5.15.134-rc1 #1
> > > [ 1.481296] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
> > > 2.5 11/26/2020
> > > [ 1.481296] RIP: 0010:trc_inspect_reader+0x80/0xb0
> >
> > This function has changed a lot, including the dropping of this
> > WARN_ON_ONCE(). The warning was replaced in 897ba84dc5aa ("rcu-tasks:
> > Handle idle tasks for recently offlined CPUs") with something that looks
> > equivalent so I'm not sure why it would not trigger in newer revisions.
> >
> > Obviously the behaviour I changed was the test for the task being idle.
> > I am not sure how best to short-circuit that test from happening during
> > boot as I am not familiar with the RCU code.
>
> The usual test for RCU's notion of early boot being completed is
> (rcu_scheduler_active != RCU_SCHEDULER_INIT).
>
> Except that "ofl" should always be false that early in boot, at least
> in mainline.

Is this still true in the final version of the patch where we set the
boot task as !idle until just before the early boot is finished? I
wouldn't think of this as 'early in boot' anymore as much as the entire
kernel setup. Maybe we need to shorten the time we stay in !idle mode
for earlier kernels?

How frequently is this function called? We could check something for
early boot... or track down where the cpu is put online and restore idle
before that happens?

>
> > It's also worth noting that the bug this fixes wasn't exposed until the
> > maple tree (added in v6.1) was used for the IRQ descriptors (added in
> > v6.5).
>
> Lots of latent bugs, to be sure, even with rcutorture. :-/

The Right Thing is to fix the bug all the way back to the introduction,
but what fallout makes the backport less desirable than living with the
unexposed bug?


Thanks,
Liam

2023-10-06 18:22:01

by Paul E. McKenney

Subject: Re: [PATCH 5.15 000/183] 5.15.134-rc1 review

On Fri, Oct 06, 2023 at 01:57:14PM -0400, Liam R. Howlett wrote:
> * Paul E. McKenney <[email protected]> [231006 12:47]:
> > On Fri, Oct 06, 2023 at 12:20:38PM -0400, Liam R. Howlett wrote:
> > > * Naresh Kamboju <[email protected]> [231005 13:49]:
> > > > On Wed, 4 Oct 2023 at 23:33, Greg Kroah-Hartman
> > > > <[email protected]> wrote:
> > > > >
> > > > > This is the start of the stable review cycle for the 5.15.134 release.
> > > > > There are 183 patches in this series, all will be posted as a response
> > > > > to this one. If anyone has any issues with these being applied, please
> > > > > let me know.
> > > > >
> > > > > Responses should be made by Fri, 06 Oct 2023 17:51:12 +0000.
> > > > > Anything received after that time might be too late.
> > > > >
> > > > > The whole patch series can be found in one patch at:
> > > > > https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.134-rc1.gz
> > > > > or in the git tree and branch at:
> > > > > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
> > > > > and the diffstat can be found below.
> > > > >
> > > > > thanks,
> > > > >
> > > > > greg k-h
> > > >
> > > > Results from Linaro’s test farm.
> > > > Regressions on x86.
> > > >
> > > > Following kernel warning noticed on x86 while booting stable-rc 5.15.134-rc1
> > > > with selftest merge config built kernel.
> > > >
> > > > Reported-by: Linux Kernel Functional Testing <[email protected]>
> > > >
> > > > Anyone noticed this kernel warning ?
> > > >
> > > > This is always reproducible while booting x86 with a given config.
> > >
> > > From that config:
> > > #
> > > # RCU Subsystem
> > > #
> > > CONFIG_TREE_RCU=y
> > > # CONFIG_RCU_EXPERT is not set
> > > CONFIG_SRCU=y
> > > CONFIG_TREE_SRCU=y
> > > CONFIG_TASKS_RCU_GENERIC=y
> > > CONFIG_TASKS_RUDE_RCU=y
> > > CONFIG_TASKS_TRACE_RCU=y
> > > CONFIG_RCU_STALL_COMMON=y
> > > CONFIG_RCU_NEED_SEGCBLIST=y
> > > # end of RCU Subsystem
> > >
> > > #
> > > # RCU Debugging
> > > #
> > > CONFIG_PROVE_RCU=y
> > > # CONFIG_RCU_SCALE_TEST is not set
> > > # CONFIG_RCU_TORTURE_TEST is not set
> > > # CONFIG_RCU_REF_SCALE_TEST is not set
> > > CONFIG_RCU_CPU_STALL_TIMEOUT=21
> > > CONFIG_RCU_TRACE=y
> > > # CONFIG_RCU_EQS_DEBUG is not set
> > > # end of RCU Debugging
> > >
> > >
> > > >
> > > > x86 boot log:
> > > > -----
> > > > [ 0.000000] Linux version 5.15.134-rc1 (tuxmake@tuxmake)
> > > > (x86_64-linux-gnu-gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils
> > > > for Debian) 2.40) #1 SMP @1696443178
> > > > ...
> > > > [ 1.480701] ------------[ cut here ]------------
> > > > [ 1.481296] WARNING: CPU: 0 PID: 13 at kernel/rcu/tasks.h:958
> > > > trc_inspect_reader+0x80/0xb0
> > > > [ 1.481296] Modules linked in:
> > > > [ 1.481296] CPU: 0 PID: 13 Comm: rcu_tasks_trace Not tainted 5.15.134-rc1 #1
> > > > [ 1.481296] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
> > > > 2.5 11/26/2020
> > > > [ 1.481296] RIP: 0010:trc_inspect_reader+0x80/0xb0
> > >
> > > This function has changed a lot, including the dropping of this
> > > WARN_ON_ONCE(). The warning was replaced in 897ba84dc5aa ("rcu-tasks:
> > > Handle idle tasks for recently offlined CPUs") with something that looks
> > > equivalent so I'm not sure why it would not trigger in newer revisions.
> > >
> > > Obviously the behaviour I changed was the test for the task being idle.
> > > I am not sure how best to short-circuit that test from happening during
> > > boot as I am not familiar with the RCU code.
> >
> > The usual test for RCU's notion of early boot being completed is
> > (rcu_scheduler_active != RCU_SCHEDULER_INIT).
> >
> > Except that "ofl" should always be false that early in boot, at least
> > in mainline.
>
> Is this still true in the final version of the patch where we set the
> boot task as !idle until just before the early boot is finished? I
> wouldn't think of this as 'early in boot' anymore as much as the entire
> kernel setup. Maybe we need to shorten the time we stay in !idle mode
> for earlier kernels?

In mainline, the ofl variable is defined as cpu_is_offline(cpu), and
during boot, the boot CPU is guaranteed to be online. (As opposed to
the boot CPU's idle-task state.)
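
A toy userspace model of that distinction, for illustration only. It
assumes the 5.15 warning has the shape ofl && !is_idle_task(t), which is
not quoted anywhere in this thread; struct model_task and both helpers
are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for the bits of task_struct that matter here. */
struct model_task {
	int cpu;       /* CPU the task is current on */
	bool is_idle;  /* models is_idle_task(t) */
};

/* Models cpu_is_offline(): bit N of online_mask set means CPU N is online. */
bool model_cpu_is_offline(unsigned long online_mask, int cpu)
{
	assert(cpu >= 0 &&
	       (unsigned long)cpu < sizeof(online_mask) * CHAR_BIT);

	return !(online_mask & (1UL << cpu));
}

/* Assumed shape of the check: warn when the CPU is offline and t is !idle. */
bool model_would_warn(const struct model_task *t, unsigned long online_mask)
{
	bool ofl = model_cpu_is_offline(online_mask, t->cpu);

	return ofl && !t->is_idle;
}
```

With only CPU 0 online (online_mask == 1), a non-idle task on CPU 0
cannot trip the modeled check, while a task on a not-yet-onlined CPU
trips it unless that task is marked idle.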

> How frequently is this function called? We could check something for
> early boot... or track down where the cpu is put online and restore idle
> before that happens?

Once per RCU Tasks Trace grace period per reader seen to be blocking
that grace period. Its performance is an issue, but not to anywhere
near the same extent as (say) rcu_read_lock_trace().

> > > It's also worth noting that the bug this fixes wasn't exposed until the
> > > maple tree (added in v6.1) was used for the IRQ descriptors (added in
> > > v6.5).
> >
> > Lots of latent bugs, to be sure, even with rcutorture. :-/
>
> The Right Thing is to fix the bug all the way back to the introduction,
> but what fallout makes the backport less desirable than living with the
> unexposed bug?

You are quite right that it is possible for the risk of a backport to
exceed the risk of the original bug.

I defer to Joel (CCed) on how best to resolve this in -stable.

Thanx, Paul

2023-10-08 01:23:29

by Joel Fernandes

Subject: Re: [PATCH 5.15 000/183] 5.15.134-rc1 review

On Fri, Oct 6, 2023 at 2:20 PM Paul E. McKenney <[email protected]> wrote:
>
> On Fri, Oct 06, 2023 at 01:57:14PM -0400, Liam R. Howlett wrote:
> > * Paul E. McKenney <[email protected]> [231006 12:47]:
> > > On Fri, Oct 06, 2023 at 12:20:38PM -0400, Liam R. Howlett wrote:
> > > > * Naresh Kamboju <[email protected]> [231005 13:49]:
> > > > > On Wed, 4 Oct 2023 at 23:33, Greg Kroah-Hartman
> > > > > <[email protected]> wrote:
> > > > > >
> > > > > > This is the start of the stable review cycle for the 5.15.134 release.
> > > > > > There are 183 patches in this series, all will be posted as a response
> > > > > > to this one. If anyone has any issues with these being applied, please
> > > > > > let me know.
> > > > > >
> > > > > > Responses should be made by Fri, 06 Oct 2023 17:51:12 +0000.
> > > > > > Anything received after that time might be too late.
> > > > > >
> > > > > > The whole patch series can be found in one patch at:
> > > > > > https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.134-rc1.gz
> > > > > > or in the git tree and branch at:
> > > > > > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
> > > > > > and the diffstat can be found below.
> > > > > >
> > > > > > thanks,
> > > > > >
> > > > > > greg k-h
> > > > >
> > > > > Results from Linaro’s test farm.
> > > > > Regressions on x86.
> > > > >
> > > > > Following kernel warning noticed on x86 while booting stable-rc 5.15.134-rc1
> > > > > with selftest merge config built kernel.
> > > > >
> > > > > Reported-by: Linux Kernel Functional Testing <[email protected]>
> > > > >
> > > > > Anyone noticed this kernel warning ?
> > > > >
> > > > > This is always reproducible while booting x86 with a given config.
> > > >
> > > > From that config:
> > > > #
> > > > # RCU Subsystem
> > > > #
> > > > CONFIG_TREE_RCU=y
> > > > # CONFIG_RCU_EXPERT is not set
> > > > CONFIG_SRCU=y
> > > > CONFIG_TREE_SRCU=y
> > > > CONFIG_TASKS_RCU_GENERIC=y
> > > > CONFIG_TASKS_RUDE_RCU=y
> > > > CONFIG_TASKS_TRACE_RCU=y
> > > > CONFIG_RCU_STALL_COMMON=y
> > > > CONFIG_RCU_NEED_SEGCBLIST=y
> > > > # end of RCU Subsystem
> > > >
> > > > #
> > > > # RCU Debugging
> > > > #
> > > > CONFIG_PROVE_RCU=y
> > > > # CONFIG_RCU_SCALE_TEST is not set
> > > > # CONFIG_RCU_TORTURE_TEST is not set
> > > > # CONFIG_RCU_REF_SCALE_TEST is not set
> > > > CONFIG_RCU_CPU_STALL_TIMEOUT=21
> > > > CONFIG_RCU_TRACE=y
> > > > # CONFIG_RCU_EQS_DEBUG is not set
> > > > # end of RCU Debugging
> > > >
> > > >
> > > > >
> > > > > x86 boot log:
> > > > > -----
> > > > > [ 0.000000] Linux version 5.15.134-rc1 (tuxmake@tuxmake)
> > > > > (x86_64-linux-gnu-gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils
> > > > > for Debian) 2.40) #1 SMP @1696443178
> > > > > ...
> > > > > [ 1.480701] ------------[ cut here ]------------
> > > > > [ 1.481296] WARNING: CPU: 0 PID: 13 at kernel/rcu/tasks.h:958
> > > > > trc_inspect_reader+0x80/0xb0
> > > > > [ 1.481296] Modules linked in:
> > > > > [ 1.481296] CPU: 0 PID: 13 Comm: rcu_tasks_trace Not tainted 5.15.134-rc1 #1
> > > > > [ 1.481296] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
> > > > > 2.5 11/26/2020
> > > > > [ 1.481296] RIP: 0010:trc_inspect_reader+0x80/0xb0
> > > >
> > > > This function has changed a lot, including the dropping of this
> > > > WARN_ON_ONCE(). The warning was replaced in 897ba84dc5aa ("rcu-tasks:
> > > > Handle idle tasks for recently offlined CPUs") with something that looks
> > > > equivalent so I'm not sure why it would not trigger in newer revisions.
> > > >
> > > > Obviously the behaviour I changed was the test for the task being idle.
> > > > I am not sure how best to short-circuit that test from happening during
> > > > boot as I am not familiar with the RCU code.
> > >
> > > The usual test for RCU's notion of early boot being completed is
> > > (rcu_scheduler_active != RCU_SCHEDULER_INIT).
> > >
> > > Except that "ofl" should always be false that early in boot, at least
> > > in mainline.
> >
> > Is this still true in the final version of the patch where we set the
> > boot task as !idle until just before the early boot is finished? I
> > wouldn't think of this as 'early in boot' anymore as much as the entire
> > kernel setup. Maybe we need to shorten the time we stay in !idle mode
> > for earlier kernels?
>
> In mainline, the ofl variable is defined as cpu_is_offline(cpu), and
> during boot, the boot CPU is guaranteed to be online. (As opposed to
> the boot CPU's idle-task state.)
>
> > How frequently is this function called? We could check something for
> > early boot... or track down where the cpu is put online and restore idle
> > before that happens?
>
> Once per RCU Tasks Trace grace period per reader seen to be blocking
> that grace period. Its performance is an issue, but not to anywhere
> near the same extent as (say) rcu_read_lock_trace().
>
> > > > It's also worth noting that the bug this fixes wasn't exposed until the
> > > > maple tree (added in v6.1) was used for the IRQ descriptors (added in
> > > > v6.5).
> > >
> > > Lots of latent bugs, to be sure, even with rcutorture. :-/
> >
> > The Right Thing is to fix the bug all the way back to the introduction,
> > but what fallout makes the backport less desirable than living with the
> > unexposed bug?
>
> You are quite right that it is possible for the risk of a backport to
> exceed the risk of the original bug.
>
> I defer to Joel (CCed) on how best to resolve this in -stable.

Maybe I am missing something but this issue should also be happening
in mainline right?

Even though mainline has 897ba84dc5aa ("rcu-tasks: Handle idle tasks
for recently offlined CPUs"), the warning should still be happening
due to Liam's "kernel/sched: Modify initial boot task idle setup"
because the warning is just rearranged a bit but essentially the same.

IMHO, the right thing to do then is to drop Liam's patch from 5.15 and
fix it in mainline (using the ideas described in this thread), then
backport both that new fix and Liam's patch to 5.15.

Or is there a reason this warning does not show up on the mainline?

My impression is that dropping Liam's patch for the stable release and
revisiting it later is a better approach since tiny RCU is used way
less in the wild than tree/tasks RCU. Thoughts?

thanks,

- Joel

2023-10-09 01:21:19

by Paul E. McKenney

Subject: Re: [PATCH 5.15 000/183] 5.15.134-rc1 review

On Sat, Oct 07, 2023 at 09:22:55PM -0400, Joel Fernandes wrote:
> On Fri, Oct 6, 2023 at 2:20 PM Paul E. McKenney <[email protected]> wrote:
> >
> > On Fri, Oct 06, 2023 at 01:57:14PM -0400, Liam R. Howlett wrote:
> > > * Paul E. McKenney <[email protected]> [231006 12:47]:
> > > > On Fri, Oct 06, 2023 at 12:20:38PM -0400, Liam R. Howlett wrote:
> > > > > * Naresh Kamboju <[email protected]> [231005 13:49]:
> > > > > > On Wed, 4 Oct 2023 at 23:33, Greg Kroah-Hartman
> > > > > > <[email protected]> wrote:
> > > > > > >
> > > > > > > This is the start of the stable review cycle for the 5.15.134 release.
> > > > > > > There are 183 patches in this series, all will be posted as a response
> > > > > > > to this one. If anyone has any issues with these being applied, please
> > > > > > > let me know.
> > > > > > >
> > > > > > > Responses should be made by Fri, 06 Oct 2023 17:51:12 +0000.
> > > > > > > Anything received after that time might be too late.
> > > > > > >
> > > > > > > The whole patch series can be found in one patch at:
> > > > > > > https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.134-rc1.gz
> > > > > > > or in the git tree and branch at:
> > > > > > > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
> > > > > > > and the diffstat can be found below.
> > > > > > >
> > > > > > > thanks,
> > > > > > >
> > > > > > > greg k-h
> > > > > >
> > > > > > Results from Linaro’s test farm.
> > > > > > Regressions on x86.
> > > > > >
> > > > > > Following kernel warning noticed on x86 while booting stable-rc 5.15.134-rc1
> > > > > > with selftest merge config built kernel.
> > > > > >
> > > > > > Reported-by: Linux Kernel Functional Testing <[email protected]>
> > > > > >
> > > > > > Anyone noticed this kernel warning ?
> > > > > >
> > > > > > This is always reproducible while booting x86 with a given config.
> > > > >
> > > > > From that config:
> > > > > #
> > > > > # RCU Subsystem
> > > > > #
> > > > > CONFIG_TREE_RCU=y
> > > > > # CONFIG_RCU_EXPERT is not set
> > > > > CONFIG_SRCU=y
> > > > > CONFIG_TREE_SRCU=y
> > > > > CONFIG_TASKS_RCU_GENERIC=y
> > > > > CONFIG_TASKS_RUDE_RCU=y
> > > > > CONFIG_TASKS_TRACE_RCU=y
> > > > > CONFIG_RCU_STALL_COMMON=y
> > > > > CONFIG_RCU_NEED_SEGCBLIST=y
> > > > > # end of RCU Subsystem
> > > > >
> > > > > #
> > > > > # RCU Debugging
> > > > > #
> > > > > CONFIG_PROVE_RCU=y
> > > > > # CONFIG_RCU_SCALE_TEST is not set
> > > > > # CONFIG_RCU_TORTURE_TEST is not set
> > > > > # CONFIG_RCU_REF_SCALE_TEST is not set
> > > > > CONFIG_RCU_CPU_STALL_TIMEOUT=21
> > > > > CONFIG_RCU_TRACE=y
> > > > > # CONFIG_RCU_EQS_DEBUG is not set
> > > > > # end of RCU Debugging
> > > > >
> > > > >
> > > > > >
> > > > > > x86 boot log:
> > > > > > -----
> > > > > > [ 0.000000] Linux version 5.15.134-rc1 (tuxmake@tuxmake)
> > > > > > (x86_64-linux-gnu-gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils
> > > > > > for Debian) 2.40) #1 SMP @1696443178
> > > > > > ...
> > > > > > [ 1.480701] ------------[ cut here ]------------
> > > > > > [ 1.481296] WARNING: CPU: 0 PID: 13 at kernel/rcu/tasks.h:958
> > > > > > trc_inspect_reader+0x80/0xb0
> > > > > > [ 1.481296] Modules linked in:
> > > > > > [ 1.481296] CPU: 0 PID: 13 Comm: rcu_tasks_trace Not tainted 5.15.134-rc1 #1
> > > > > > [ 1.481296] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
> > > > > > 2.5 11/26/2020
> > > > > > [ 1.481296] RIP: 0010:trc_inspect_reader+0x80/0xb0
> > > > >
> > > > > This function has changed a lot, including the dropping of this
> > > > > WARN_ON_ONCE(). The warning was replaced in 897ba84dc5aa ("rcu-tasks:
> > > > > Handle idle tasks for recently offlined CPUs") with something that looks
> > > > > equivalent so I'm not sure why it would not trigger in newer revisions.
> > > > >
> > > > > Obviously the behaviour I changed was the test for the task being idle.
> > > > > I am not sure how best to short-circuit that test from happening during
> > > > > boot as I am not familiar with the RCU code.
> > > >
> > > > The usual test for RCU's notion of early boot being completed is
> > > > (rcu_scheduler_active != RCU_SCHEDULER_INIT).
> > > >
> > > > Except that "ofl" should always be false that early in boot, at least
> > > > in mainline.
> > >
> > > Is this still true in the final version of the patch where we set the
> > > boot task as !idle until just before the early boot is finished? I
> > > wouldn't think of this as 'early in boot' anymore as much as the entire
> > > kernel setup. Maybe we need to shorten the time we stay in !idle mode
> > > for earlier kernels?
> >
> > In mainline, the ofl variable is defined as cpu_is_offline(cpu), and
> > during boot, the boot CPU is guaranteed to be online. (As opposed to
> > the boot CPU's idle-task state.)
> >
> > > How frequent is this function called? We could check something for
> > > early boot... or track down where the cpu is put online and restore idle
> > > before that happens?
> >
> > Once per RCU Tasks Trace grace period per reader seen to be blocking
> > that grace period. Its performance is an issue, but not to anywhere
> > near the same extent as (say) rcu_read_lock_trace().
> >
> > > > > It's also worth noting that the bug this fixes wasn't exposed until the
> > > > > maple tree (added in v6.1) was used for the IRQ descriptors (added in
> > > > > v6.5).
> > > >
> > > > Lots of latent bugs, to be sure, even with rcutorture. :-/
> > >
> > > The Right Thing is to fix the bug all the way back to the introduction,
> > > but what fallout makes the backport less desirable than living with the
> > > unexposed bug?
> >
> > You are quite right that it is possible for the risk of a backport to
> > exceed the risk of the original bug.
> >
> > I defer to Joel (CCed) on how best to resolve this in -stable.
>
> Maybe I am missing something but this issue should also be happening
> in mainline right?
>
> Even though mainline has 897ba84dc5aa ("rcu-tasks: Handle idle tasks
> for recently offlined CPUs") , the warning should still be happening
> due to Liam's "kernel/sched: Modify initial boot task idle setup"
> because the warning is just rearranged a bit but essentially the same.
>
> IMHO, the right thing to do then is to drop Liam's patch from 5.15 and
> fix it in mainline (using the ideas described in this thread), then
> backport both that new fix and Liam's patch to 5.15.
>
> Or is there a reason this warning does not show up on the mainline?
>
> My impression is that dropping Liam's patch for the stable release and
> revisiting it later is a better approach since tiny RCU is used way
> less in the wild than tree/tasks RCU. Thoughts?

I think that this one is strange enough that we need to write down the
situation in detail, make sure we have all the corner cases covered in
both mainline and -stable, and decide what to do from there.

Yes, I know, this email thread contains much of this information, but
a little organizing of it would be good.

Would you like to put that together, or should I? If me, I will get
a draft out by the end of this coming Tuesday, Pacific Time.

Thanx, Paul

2023-10-11 01:34:53

by Paul E. McKenney

Subject: Re: [PATCH 5.15 000/183] 5.15.134-rc1 review

On Sun, Oct 08, 2023 at 06:20:53PM -0700, Paul E. McKenney wrote:
> On Sat, Oct 07, 2023 at 09:22:55PM -0400, Joel Fernandes wrote:
> > On Fri, Oct 6, 2023 at 2:20 PM Paul E. McKenney <[email protected]> wrote:
> > >
> > > On Fri, Oct 06, 2023 at 01:57:14PM -0400, Liam R. Howlett wrote:
> > > > * Paul E. McKenney <[email protected]> [231006 12:47]:
> > > > > On Fri, Oct 06, 2023 at 12:20:38PM -0400, Liam R. Howlett wrote:
> > > > > > * Naresh Kamboju <[email protected]> [231005 13:49]:
> > > > > > > On Wed, 4 Oct 2023 at 23:33, Greg Kroah-Hartman
> > > > > > > <[email protected]> wrote:
> > > > > > > >
> > > > > > > > This is the start of the stable review cycle for the 5.15.134 release.
> > > > > > > > There are 183 patches in this series, all will be posted as a response
> > > > > > > > to this one. If anyone has any issues with these being applied, please
> > > > > > > > let me know.
> > > > > > > >
> > > > > > > > Responses should be made by Fri, 06 Oct 2023 17:51:12 +0000.
> > > > > > > > Anything received after that time might be too late.
> > > > > > > >
> > > > > > > > The whole patch series can be found in one patch at:
> > > > > > > > https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.134-rc1.gz
> > > > > > > > or in the git tree and branch at:
> > > > > > > > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
> > > > > > > > and the diffstat can be found below.
> > > > > > > >
> > > > > > > > thanks,
> > > > > > > >
> > > > > > > > greg k-h
> > > > > > >
> > > > > > > Results from Linaro’s test farm.
> > > > > > > Regressions on x86.
> > > > > > >
> > > > > > > Following kernel warning noticed on x86 while booting stable-rc 5.15.134-rc1
> > > > > > > with selftest merge config built kernel.
> > > > > > >
> > > > > > > Reported-by: Linux Kernel Functional Testing <[email protected]>
> > > > > > >
> > > > > > > Anyone noticed this kernel warning ?
> > > > > > >
> > > > > > > This is always reproducible while booting x86 with a given config.
> > > > > >
> > > > > > From that config:
> > > > > > #
> > > > > > # RCU Subsystem
> > > > > > #
> > > > > > CONFIG_TREE_RCU=y
> > > > > > # CONFIG_RCU_EXPERT is not set
> > > > > > CONFIG_SRCU=y
> > > > > > CONFIG_TREE_SRCU=y
> > > > > > CONFIG_TASKS_RCU_GENERIC=y
> > > > > > CONFIG_TASKS_RUDE_RCU=y
> > > > > > CONFIG_TASKS_TRACE_RCU=y
> > > > > > CONFIG_RCU_STALL_COMMON=y
> > > > > > CONFIG_RCU_NEED_SEGCBLIST=y
> > > > > > # end of RCU Subsystem
> > > > > >
> > > > > > #
> > > > > > # RCU Debugging
> > > > > > #
> > > > > > CONFIG_PROVE_RCU=y
> > > > > > # CONFIG_RCU_SCALE_TEST is not set
> > > > > > # CONFIG_RCU_TORTURE_TEST is not set
> > > > > > # CONFIG_RCU_REF_SCALE_TEST is not set
> > > > > > CONFIG_RCU_CPU_STALL_TIMEOUT=21
> > > > > > CONFIG_RCU_TRACE=y
> > > > > > # CONFIG_RCU_EQS_DEBUG is not set
> > > > > > # end of RCU Debugging
> > > > > >
> > > > > >
> > > > > > >
> > > > > > > x86 boot log:
> > > > > > > -----
> > > > > > > [ 0.000000] Linux version 5.15.134-rc1 (tuxmake@tuxmake)
> > > > > > > (x86_64-linux-gnu-gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils
> > > > > > > for Debian) 2.40) #1 SMP @1696443178
> > > > > > > ...
> > > > > > > [ 1.480701] ------------[ cut here ]------------
> > > > > > > [ 1.481296] WARNING: CPU: 0 PID: 13 at kernel/rcu/tasks.h:958
> > > > > > > trc_inspect_reader+0x80/0xb0
> > > > > > > [ 1.481296] Modules linked in:
> > > > > > > [ 1.481296] CPU: 0 PID: 13 Comm: rcu_tasks_trace Not tainted 5.15.134-rc1 #1
> > > > > > > [ 1.481296] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
> > > > > > > 2.5 11/26/2020
> > > > > > > [ 1.481296] RIP: 0010:trc_inspect_reader+0x80/0xb0
> > > > > >
> > > > > > This function has changed a lot, including the dropping of this
> > > > > > WARN_ON_ONCE(). The warning was replaced in 897ba84dc5aa ("rcu-tasks:
> > > > > > Handle idle tasks for recently offlined CPUs") with something that looks
> > > > > > equivalent so I'm not sure why it would not trigger in newer revisions.
> > > > > >
> > > > > > Obviously the behaviour I changed was the test for the task being idle.
> > > > > > I am not sure how best to short-circuit that test from happening during
> > > > > > boot as I am not familiar with the RCU code.
> > > > >
> > > > > The usual test for RCU's notion of early boot being completed is
> > > > > (rcu_scheduler_active != RCU_SCHEDULER_INIT).
> > > > >
> > > > > Except that "ofl" should always be false that early in boot, at least
> > > > > in mainline.
> > > >
> > > > Is this still true in the final version of the patch where we set the
> > > > boot task as !idle until just before the early boot is finished? I
> > > > wouldn't think of this as 'early in boot' anymore as much as the entire
> > > > kernel setup. Maybe we need to shorten the time we stay in !idle mode
> > > > for earlier kernels?
> > >
> > > In mainline, the ofl variable is defined as cpu_is_offline(cpu), and
> > > during boot, the boot CPU is guaranteed to be online. (As opposed to
> > > the boot CPU's idle-task state.)
> > >
> > > > How frequent is this function called? We could check something for
> > > > early boot... or track down where the cpu is put online and restore idle
> > > > before that happens?
> > >
> > > Once per RCU Tasks Trace grace period per reader seen to be blocking
> > > that grace period. Its performance is an issue, but not to anywhere
> > > near the same extent as (say) rcu_read_lock_trace().
> > >
> > > > > > It's also worth noting that the bug this fixes wasn't exposed until the
> > > > > > maple tree (added in v6.1) was used for the IRQ descriptors (added in
> > > > > > v6.5).
> > > > >
> > > > > Lots of latent bugs, to be sure, even with rcutorture. :-/
> > > >
> > > > The Right Thing is to fix the bug all the way back to the introduction,
> > > > but what fallout makes the backport less desirable than living with the
> > > > unexposed bug?
> > >
> > > You are quite right that it is possible for the risk of a backport to
> > > exceed the risk of the original bug.
> > >
> > > I defer to Joel (CCed) on how best to resolve this in -stable.
> >
> > Maybe I am missing something but this issue should also be happening
> > in mainline right?
> >
> > Even though mainline has 897ba84dc5aa ("rcu-tasks: Handle idle tasks
> > for recently offlined CPUs") , the warning should still be happening
> > due to Liam's "kernel/sched: Modify initial boot task idle setup"
> > because the warning is just rearranged a bit but essentially the same.
> >
> > IMHO, the right thing to do then is to drop Liam's patch from 5.15 and
> > fix it in mainline (using the ideas described in this thread), then
> > backport both that new fix and Liam's patch to 5.15.
> >
> > Or is there a reason this warning does not show up on the mainline?

There is not a whole lot of commonality between the v5.15.134 version of
RCU Tasks Trace and that of mainline. In theory, in mainline, CPU hotplug
is supposed to be disabled across all calls to trc_inspect_reader(),
which means that there would not be any CPU coming or going.

But there could potentially be some time between when a CPU was
marked as online and its idle task was marked PF_IDLE. And in
fact x86 start_secondary() invokes set_cpu_online() before it calls
cpu_startup_entry(), and it is the latter that sets PF_IDLE.

The same is true of alpha, arc, arm, arm64, csky, ia64, loongarch, mips,
openrisc, parisc, powerpc, riscv, s390, sh, sparc32, sparc64, x86 xen,
and xtensa, which is everybody.
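
The ordering described above can be sketched as a small user-space model.
This is only an illustration, not kernel code: struct model_cpu,
MODEL_PF_IDLE and the model_*() helpers are hypothetical stand-ins for the
online mask, the idle task's PF_IDLE flag, set_cpu_online() and
cpu_startup_entry().

```c
#include <stdbool.h>

#define MODEL_PF_IDLE 0x2u	/* stand-in for the idle task's PF_IDLE flag */

/* Hypothetical model of one secondary CPU and its idle task. */
struct model_cpu {
	bool online;			/* models the CPU's bit in the online mask */
	unsigned int idle_flags;	/* models the idle task's ->flags */
};

/* Models set_cpu_online(cpu, true) as invoked from start_secondary(). */
static void model_set_cpu_online(struct model_cpu *c)
{
	c->online = true;
}

/* Models the PF_IDLE marking that cpu_startup_entry() performs. */
static void model_cpu_startup_entry(struct model_cpu *c)
{
	c->idle_flags |= MODEL_PF_IDLE;
}

/* True while the CPU is online but its idle task is not yet PF_IDLE. */
static bool model_in_window(const struct model_cpu *c)
{
	return c->online && !(c->idle_flags & MODEL_PF_IDLE);
}
```

Calling model_set_cpu_online() and then model_cpu_startup_entry() on a
zeroed struct model_cpu makes model_in_window() true only between the
two calls, which is the interval in question.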

One reason why my testing did not reproduce this is because I was running
against v6.6-rc1, and cff9b2332ab7 ("kernel/sched: Modify initial boot
task idle setup") went into v6.6-rc3. An initial run merging in current
mainline also failed to reproduce this, but I am running overnight.
If that doesn't reproduce, I will try inserting delays between the
set_cpu_online() and the cpu_startup_entry().

If this problem is real, fixes include:

o Revert Liam's patch and make Tiny RCU's call_rcu() deal with
the problem. This is overhead and non-tinyness, but to Joel's
point, it might be best.

o Go back to something more like Liam's original patch, which
cleared PF_IDLE only for the boot CPU.

o Set PF_IDLE before calling set_cpu_online(). This would work,
but it would also be rather ugly, reaching into each and every
architecture.

o Move the call to set_cpu_online() into cpu_startup_entry().
This would require some serious inspection to prove that it is
safe, assuming that it is in fact safe.

o Drop the WARN_ON_ONCE() from trc_inspect_reader(). Not all
that excited by losing this diagnostic, but then again it
has been awhile since it has caught anything.

o Change the WARN_ON_ONCE() in trc_inspect_reader() into a
"return false" to retry later. Ditto, also not liking the
possibility of indefinite deferral with no warning.

There are likely other approaches.
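
To make the trade-off between the last two options concrete, here is a toy
user-space model. Everything in it is an assumption made for illustration:
the enum names, model_inspect() and model_retries_until_idle() are
hypothetical and do not reproduce the real trc_inspect_reader().

```c
#include <stdbool.h>

/* Hypothetical outcomes of one inspection attempt in this toy model. */
enum model_outcome {
	MODEL_PROCEED,		/* condition not hit, or warning dropped */
	MODEL_WARN_PROCEED,	/* keep the warning: warn, then carry on */
	MODEL_RETRY_LATER,	/* "return false": caller tries again later */
};

/* Hypothetical knob selecting how the suspicious case is handled. */
enum model_policy { POLICY_WARN, POLICY_DROP_WARN, POLICY_RETRY };

static enum model_outcome
model_inspect(bool ofl, bool curr, bool idle, enum model_policy policy)
{
	/* Same shape as the condition under discussion. */
	bool suspicious = ofl && curr && !idle;

	if (!suspicious)
		return MODEL_PROCEED;
	switch (policy) {
	case POLICY_WARN:
		return MODEL_WARN_PROCEED;
	case POLICY_RETRY:
		return MODEL_RETRY_LATER;
	default:
		return MODEL_PROCEED;	/* POLICY_DROP_WARN: silent */
	}
}

/*
 * Counts retries under POLICY_RETRY when the idle flag only appears
 * after idle_after attempts; gives up silently at max_tries.
 */
static int model_retries_until_idle(int idle_after, int max_tries)
{
	int tries = 0;

	while (tries < max_tries &&
	       model_inspect(true, true, tries >= idle_after,
			     POLICY_RETRY) == MODEL_RETRY_LATER)
		tries++;
	return tries;
}
```

If the idle flag never shows up, the retry policy in this model simply runs
to max_tries without reporting anything, which is the indefinite-deferral
concern noted in the last list item.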

> > My impression is that dropping Liam's patch for the stable release and
> > revisiting it later is a better approach since tiny RCU is used way
> > less in the wild than tree/tasks RCU. Thoughts?
>
> I think that this one is strange enough that we need to write down the
> situation in detail, make sure we have all the corner cases covered in
> both mainline and -stable, and decide what to do from there.
>
> Yes, I know, this email thread contains much of this information, but
> a little organizing of it would be good.
>
> Would you like to put that together, or should I? If me, I will get
> a draft out by the end of this coming Tuesday, Pacific Time.

And I guess that this is that draft.

It is quite possible that Tasks RCU also has issues with momentary
online non-idleness of non-boot-CPU idle tasks, but checking that is a
task for another time.

Thanx, Paul

2023-10-11 02:44:40

by Joel Fernandes

Subject: Re: [PATCH 5.15 000/183] 5.15.134-rc1 review

On Sun, Oct 8, 2023 at 9:20 PM Paul E. McKenney <[email protected]> wrote:
[...]
> > > > How frequent is this function called? We could check something for
> > > > early boot... or track down where the cpu is put online and restore idle
> > > > before that happens?
> > >
> > > Once per RCU Tasks Trace grace period per reader seen to be blocking
> > > that grace period. Its performance is an issue, but not to anywhere
> > > near the same extent as (say) rcu_read_lock_trace().
> > >
> > > > > > It's also worth noting that the bug this fixes wasn't exposed until the
> > > > > > maple tree (added in v6.1) was used for the IRQ descriptors (added in
> > > > > > v6.5).
> > > > >
> > > > > Lots of latent bugs, to be sure, even with rcutorture. :-/
> > > >
> > > > The Right Thing is to fix the bug all the way back to the introduction,
> > > > but what fallout makes the backport less desirable than living with the
> > > > unexposed bug?
> > >
> > > You are quite right that it is possible for the risk of a backport to
> > > exceed the risk of the original bug.
> > >
> > > I defer to Joel (CCed) on how best to resolve this in -stable.
> >
> > Maybe I am missing something but this issue should also be happening
> > in mainline right?
> >
> > Even though mainline has 897ba84dc5aa ("rcu-tasks: Handle idle tasks
> > for recently offlined CPUs") , the warning should still be happening
> > due to Liam's "kernel/sched: Modify initial boot task idle setup"
> > because the warning is just rearranged a bit but essentially the same.
> >
> > IMHO, the right thing to do then is to drop Liam's patch from 5.15 and
> > fix it in mainline (using the ideas described in this thread), then
> > backport both that new fix and Liam's patch to 5.15.
> >
> > Or is there a reason this warning does not show up on the mainline?
> >
> > My impression is that dropping Liam's patch for the stable release and
> > revisiting it later is a better approach since tiny RCU is used way
> > less in the wild than tree/tasks RCU. Thoughts?
>
> I think that this one is strange enough that we need to write down the
> situation in detail, make sure we have all the corner cases covered in
> both mainline and -stable, and decide what to do from there.
>
> Yes, I know, this email thread contains much of this information, but
> a little organizing of it would be good.
>
> Would you like to put that together, or should I? If me, I will get
> a draft out by the end of this coming Tuesday, Pacific Time.

I apologize, I haven't been able to do any real work as I was OOO for
the most part due to dental issues. I am about 25% back now. I will
review your other email writeup and thanks for putting it together!

- Joel

2023-10-11 03:12:08

by Paul E. McKenney

Subject: Re: [PATCH 5.15 000/183] 5.15.134-rc1 review

On Tue, Oct 10, 2023 at 10:44:16PM -0400, Joel Fernandes wrote:
> On Sun, Oct 8, 2023 at 9:20 PM Paul E. McKenney <[email protected]> wrote:
> [...]
> > > > > How frequent is this function called? We could check something for
> > > > > early boot... or track down where the cpu is put online and restore idle
> > > > > before that happens?
> > > >
> > > > Once per RCU Tasks Trace grace period per reader seen to be blocking
> > > > that grace period. Its performance is an issue, but not to anywhere
> > > > near the same extent as (say) rcu_read_lock_trace().
> > > >
> > > > > > > It's also worth noting that the bug this fixes wasn't exposed until the
> > > > > > > maple tree (added in v6.1) was used for the IRQ descriptors (added in
> > > > > > > v6.5).
> > > > > >
> > > > > > Lots of latent bugs, to be sure, even with rcutorture. :-/
> > > > >
> > > > > The Right Thing is to fix the bug all the way back to the introduction,
> > > > > but what fallout makes the backport less desirable than living with the
> > > > > unexposed bug?
> > > >
> > > > You are quite right that it is possible for the risk of a backport to
> > > > exceed the risk of the original bug.
> > > >
> > > > I defer to Joel (CCed) on how best to resolve this in -stable.
> > >
> > > Maybe I am missing something but this issue should also be happening
> > > in mainline right?
> > >
> > > Even though mainline has 897ba84dc5aa ("rcu-tasks: Handle idle tasks
> > > for recently offlined CPUs") , the warning should still be happening
> > > due to Liam's "kernel/sched: Modify initial boot task idle setup"
> > > because the warning is just rearranged a bit but essentially the same.
> > >
> > > IMHO, the right thing to do then is to drop Liam's patch from 5.15 and
> > > fix it in mainline (using the ideas described in this thread), then
> > > backport both that new fix and Liam's patch to 5.15.
> > >
> > > Or is there a reason this warning does not show up on the mainline?
> > >
> > > My impression is that dropping Liam's patch for the stable release and
> > > revisiting it later is a better approach since tiny RCU is used way
> > > less in the wild than tree/tasks RCU. Thoughts?
> >
> > I think that this one is strange enough that we need to write down the
> > situation in detail, make sure we have all the corner cases covered in
> > both mainline and -stable, and decide what to do from there.
> >
> > Yes, I know, this email thread contains much of this information, but
> > a little organizing of it would be good.
> >
> > Would you like to put that together, or should I? If me, I will get
> > a draft out by the end of this coming Tuesday, Pacific Time.
>
> I apologize, I haven't been able to do any real work as I was OOO for
> the most part due to dental issues. I am about 25% back now. I will
> review your other email writeup and thanks for putting it together!

No need to apologize! If anything, it is I who should apologize for
not digging deeply into this to begin with. As always, there were
distractions. ;-)

Thanx, Paul

2023-10-11 05:05:25

by Joel Fernandes

Subject: Re: [PATCH 5.15 000/183] 5.15.134-rc1 review

On Tue, Oct 10, 2023 at 06:34:35PM -0700, Paul E. McKenney wrote:
[...]
> > > > > > > It's also worth noting that the bug this fixes wasn't exposed until the
> > > > > > > maple tree (added in v6.1) was used for the IRQ descriptors (added in
> > > > > > > v6.5).
> > > > > >
> > > > > > Lots of latent bugs, to be sure, even with rcutorture. :-/
> > > > >
> > > > > The Right Thing is to fix the bug all the way back to the introduction,
> > > > > but what fallout makes the backport less desirable than living with the
> > > > > unexposed bug?
> > > >
> > > > You are quite right that it is possible for the risk of a backport to
> > > > exceed the risk of the original bug.
> > > >
> > > > I defer to Joel (CCed) on how best to resolve this in -stable.
> > >
> > > Maybe I am missing something but this issue should also be happening
> > > in mainline right?
> > >
> > > Even though mainline has 897ba84dc5aa ("rcu-tasks: Handle idle tasks
> > > for recently offlined CPUs") , the warning should still be happening
> > > due to Liam's "kernel/sched: Modify initial boot task idle setup"
> > > because the warning is just rearranged a bit but essentially the same.
> > >
> > > IMHO, the right thing to do then is to drop Liam's patch from 5.15 and
> > > fix it in mainline (using the ideas described in this thread), then
> > > backport both that new fix and Liam's patch to 5.15.
> > >
> > > Or is there a reason this warning does not show up on the mainline?
>
> There is not a whole lot of commonality between the v5.15.134 version of
> RCU Tasks Trace and that of mainline. In theory, in mainline, CPU hotplug
> is supposed to be disabled across all calls to trc_inspect_reader(),
> which means that there would not be any CPU coming or going.
>
> But there could potentially be some time between when a CPU was
> marked as online and its idle task was marked PF_IDLE. And in
> fact x86 start_secondary() invokes set_cpu_online() before it calls
> cpu_startup_entry(), and it is the latter that sets PF_IDLE.
>
> The same is true of alpha, arc, arm, arm64, csky, ia64, loongarch, mips,
> openrisc, parisc, powerpc, riscv, s390, sh, sparc32, sparc64, x86 xen,
> and xtensa, which is everybody.
>
> One reason why my testing did not reproduce this is because I was running
> against v6.6-rc1, and cff9b2332ab7 ("kernel/sched: Modify initial boot
> task idle setup") went into v6.6-rc3. An initial run merging in current
> mainline also failed to reproduce this, but I am running overnight.
> If that doesn't reproduce, I will try inserting delays between the
> set_cpu_online() and the cpu_startup_entry().

I thought the warning happens before set_cpu_online() is even called, because
in that situation, ofl == true and the task is not set to PF_IDLE yet:

WARN_ON_ONCE(ofl && task_curr(t) && !is_idle_task(t));

> If this problem is real, fixes include:
>
> o Revert Liam's patch and make Tiny RCU's call_rcu() deal with
> the problem. This is overhead and non-tinyness, but to Joel's
> point, it might be best.
>
> o Go back to something more like Liam's original patch, which
> cleared PF_IDLE only for the boot CPU.
>
> o Set PF_IDLE before calling set_cpu_online(). This would work,
> but it would also be rather ugly, reaching into each and every
> architecture.
>
> o Move the call to set_cpu_online() into cpu_startup_entry().
> This would require some serious inspection to prove that it is
> safe, assuming that it is in fact safe.
>
> o Drop the WARN_ON_ONCE() from trc_inspect_reader(). Not all
> that excited by losing this diagnostic, but then again it
> has been awhile since it has caught anything.
>
> o Change the WARN_ON_ONCE() in trc_inspect_reader() into a
> "return false" to retry later. Ditto, also not liking the
> possibility of indefinite deferral with no warning.

Just for completeness,

o Since it is just a warning, checking for task_struct::pid == 0 instead of is_idle_task()?
Though PF_IDLE is also set in play_idle_precise().

o Change warning to:
WARN_ON_ONCE(ofl && task_curr(t) && (!is_idle_task(t) && t->pid != 0));
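
The difference between the existing condition and the one proposed above
can be checked with a small user-space model. The struct and helpers below
are hypothetical stand-ins (on_cpu models task_curr(t), MODEL_PF_IDLE models
the flag tested by is_idle_task()); only the two boolean expressions are
taken from this thread.

```c
#include <stdbool.h>

#define MODEL_PF_IDLE 0x2u	/* stand-in for the PF_IDLE task flag */

/* Hypothetical cut-down task: just the fields the conditions look at. */
struct model_task {
	unsigned int flags;	/* models t->flags */
	int pid;		/* models t->pid */
	bool on_cpu;		/* models task_curr(t) */
};

static bool model_is_idle_task(const struct model_task *t)
{
	return (t->flags & MODEL_PF_IDLE) != 0;
}

/* ofl && task_curr(t) && !is_idle_task(t) */
static bool warn_original(bool ofl, const struct model_task *t)
{
	return ofl && t->on_cpu && !model_is_idle_task(t);
}

/* ofl && task_curr(t) && (!is_idle_task(t) && t->pid != 0) */
static bool warn_proposed(bool ofl, const struct model_task *t)
{
	return ofl && t->on_cpu && (!model_is_idle_task(t) && t->pid != 0);
}
```

In this model the two conditions differ only for a running pid-0 task that
has not yet been marked PF_IDLE on an offline CPU: the original condition
fires, the proposed one does not. A task with PF_IDLE set and a nonzero pid
(the play_idle_precise() case mentioned above) triggers neither.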

thanks,

- Joel

2023-10-11 10:25:19

by Paul E. McKenney

Subject: Re: [PATCH 5.15 000/183] 5.15.134-rc1 review

On Wed, Oct 11, 2023 at 05:05:04AM +0000, Joel Fernandes wrote:
> On Tue, Oct 10, 2023 at 06:34:35PM -0700, Paul E. McKenney wrote:
> [...]
> > > > > > > > It's also worth noting that the bug this fixes wasn't exposed until the
> > > > > > > > maple tree (added in v6.1) was used for the IRQ descriptors (added in
> > > > > > > > v6.5).
> > > > > > >
> > > > > > > Lots of latent bugs, to be sure, even with rcutorture. :-/
> > > > > >
> > > > > > The Right Thing is to fix the bug all the way back to the introduction,
> > > > > > but what fallout makes the backport less desirable than living with the
> > > > > > unexposed bug?
> > > > >
> > > > > You are quite right that it is possible for the risk of a backport to
> > > > > exceed the risk of the original bug.
> > > > >
> > > > > I defer to Joel (CCed) on how best to resolve this in -stable.
> > > >
> > > > Maybe I am missing something but this issue should also be happening
> > > > in mainline right?
> > > >
> > > > Even though mainline has 897ba84dc5aa ("rcu-tasks: Handle idle tasks
> > > > for recently offlined CPUs") , the warning should still be happening
> > > > due to Liam's "kernel/sched: Modify initial boot task idle setup"
> > > > because the warning is just rearranged a bit but essentially the same.
> > > >
> > > > IMHO, the right thing to do then is to drop Liam's patch from 5.15 and
> > > > fix it in mainline (using the ideas described in this thread), then
> > > > backport both that new fix and Liam's patch to 5.15.
> > > >
> > > > Or is there a reason this warning does not show up on the mainline?
> >
> > There is not a whole lot of commonality between the v5.15.134 version of
> > RCU Tasks Trace and that of mainline. In theory, in mainline, CPU hotplug
> > is supposed to be disabled across all calls to trc_inspect_reader(),
> > which means that there would not be any CPU coming or going.
> >
> > But there could potentially be some time between when a CPU was
> > marked as online and its idle task was marked PF_IDLE. And in
> > fact x86 start_secondary() invokes set_cpu_online() before it calls
> > cpu_startup_entry(), and it is the latter that sets PF_IDLE.
> >
> > The same is true of alpha, arc, arm, arm64, csky, ia64, loongarch, mips,
> > openrisc, parisc, powerpc, riscv, s390, sh, sparc32, sparc64, x86 xen,
> > and xtensa, which is everybody.
> >
> > One reason why my testing did not reproduce this is because I was running
> > against v6.6-rc1, and cff9b2332ab7 ("kernel/sched: Modify initial boot
> > task idle setup") went into v6.6-rc3. An initial run merging in current
> > mainline also failed to reproduce this, but I am running overnight.
> > If that doesn't reproduce, I will try inserting delays between the
> > set_cpu_online() and the cpu_startup_entry().
>
> I thought the warning happens before set_cpu_online() is even called, because
> in that situation, ofl == true and the task is not set to PF_IDLE yet:
>
> WARN_ON_ONCE(ofl && task_curr(t) && !is_idle_task(t));

That case is supposed to be excluded by the cpus_read_lock() calls.
Yes, key phrase "supposed to be". ;-)

> > If this problem is real, fixes include:
> >
> > o Revert Liam's patch and make Tiny RCU's call_rcu() deal with
> > the problem. This is overhead and non-tinyness, but to Joel's
> > point, it might be best.
> >
> > o Go back to something more like Liam's original patch, which
> > cleared PF_IDLE only for the boot CPU.
> >
> > o Set PF_IDLE before calling set_cpu_online(). This would work,
> > but it would also be rather ugly, reaching into each and every
> > architecture.
> >
> > o Move the call to set_cpu_online() into cpu_startup_entry().
> > This would require some serious inspection to prove that it is
> > safe, assuming that it is in fact safe.
> >
> > o Drop the WARN_ON_ONCE() from trc_inspect_reader(). Not all
> > that excited by losing this diagnostic, but then again it
> > has been a while since it has caught anything.
> >
> > o Change the WARN_ON_ONCE() in trc_inspect_reader() into a
> > "return false" to retry later. Ditto, also not liking the
> > possibility of indefinite deferral with no warning.
>
> Just for completeness,
>
> o Since it is just a warning, check for task_struct::pid == 0 instead of is_idle_task()?
> Though PF_IDLE is also set in play_idle_precise().
>
> o Change warning to:
> WARN_ON_ONCE(ofl && task_curr(t) && (!is_idle_task(t) && t->pid != 0));

This change does look promising, thank you!

Thanx, Paul
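
The difference between the two warning conditions discussed above can be sketched in user space. This is only a model, not kernel code: struct model_task, its on_cpu field (standing in for task_curr()), and the warn_current()/warn_proposed() helpers are hypothetical names introduced for illustration. The PF_IDLE value and the flag test in model_is_idle_task() are assumed to mirror the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed to mirror the kernel's PF_IDLE flag bit. */
#define PF_IDLE 0x00000002

/* Hypothetical stand-in for the task_struct fields the check reads. */
struct model_task {
	unsigned int flags; /* PF_* flags */
	int pid;            /* 0 for the per-CPU idle tasks */
	bool on_cpu;        /* models task_curr(t) */
};

/* Models is_idle_task(): true only once PF_IDLE has been set. */
static bool model_is_idle_task(const struct model_task *t)
{
	return (t->flags & PF_IDLE) != 0;
}

/* The condition quoted from trc_inspect_reader() in the thread. */
static bool warn_current(bool ofl, const struct model_task *t)
{
	return ofl && t->on_cpu && !model_is_idle_task(t);
}

/* The condition proposed in the thread: pid 0 also counts as idle. */
static bool warn_proposed(bool ofl, const struct model_task *t)
{
	return ofl && t->on_cpu && (!model_is_idle_task(t) && t->pid != 0);
}

/* Walk through the cases the thread is concerned with. */
static void model_selfcheck(void)
{
	/* Idle task on an offline CPU that has not yet set PF_IDLE. */
	struct model_task early = { .flags = 0, .pid = 0, .on_cpu = true };
	/* Idle task that has entered the idle loop. */
	struct model_task idle = { .flags = PF_IDLE, .pid = 0, .on_cpu = true };
	/* Ordinary task somehow running on an offline CPU. */
	struct model_task stray = { .flags = 0, .pid = 42, .on_cpu = true };

	assert(warn_current(true, &early));   /* the false positive */
	assert(!warn_proposed(true, &early)); /* silenced by the pid test */
	assert(!warn_current(true, &idle));
	assert(!warn_proposed(true, &idle));
	assert(warn_current(true, &stray));   /* real bug still caught */
	assert(warn_proposed(true, &stray));
	assert(!warn_proposed(false, &stray)); /* online CPU: no warning */
}
```

In this model the proposed condition differs from the current one only for a pid 0 task that has not yet set PF_IDLE, which is the window described earlier in the thread.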

2023-10-11 13:47:41

by Frederic Weisbecker

Subject: Re: [PATCH 5.15 000/183] 5.15.134-rc1 review

On Tue, Oct 10, 2023 at 06:34:35PM -0700, Paul E. McKenney wrote:
> If this problem is real, fixes include:
>
> o Revert Liam's patch and make Tiny RCU's call_rcu() deal with
> the problem. This is overhead and non-tinyness, but to Joel's
> point, it might be best.

But what is calling call_rcu() or start_poll_synchronize_rcu() so
early that the CPU is not even online? (That is before boot_cpu_init()!)

Deferring PF_IDLE setting might pave the way for more issues like this one,
present or future. Though is_idle_task() returning true when the task is not
in the idle loop but is playing the init/0 role is debatable.

An alternative for Tiny RCU is to force a ksoftirqd wakeup when call_rcu()
is invoked from the idle task, since rcu_qs() during the context switch raises
a softirq anyway. That is more overhead for start_poll_synchronize_rcu(), but
do we expect much RCU polling in idle?

diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index a92bce40b04b..6ab15233e2be 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -604,6 +604,7 @@ extern void __raise_softirq_irqoff(unsigned int nr);

extern void raise_softirq_irqoff(unsigned int nr);
extern void raise_softirq(unsigned int nr);
+extern void raise_ksoftirqd_irqsoff(unsigned int nr);

DECLARE_PER_CPU(struct task_struct *, ksoftirqd);

diff --git a/kernel/rcu/tiny.c b/kernel/rcu/tiny.c
index 42f7589e51e0..872dab8b8b53 100644
--- a/kernel/rcu/tiny.c
+++ b/kernel/rcu/tiny.c
@@ -189,12 +189,12 @@ void call_rcu(struct rcu_head *head, rcu_callback_t func)
local_irq_save(flags);
*rcu_ctrlblk.curtail = head;
rcu_ctrlblk.curtail = &head->next;
- local_irq_restore(flags);

if (unlikely(is_idle_task(current))) {
/* force scheduling for rcu_qs() */
- resched_cpu(0);
+ raise_ksoftirqd_irqsoff(RCU_SOFTIRQ);
}
+ local_irq_restore(flags);
}
EXPORT_SYMBOL_GPL(call_rcu);

@@ -225,10 +225,13 @@ EXPORT_SYMBOL_GPL(get_state_synchronize_rcu);
unsigned long start_poll_synchronize_rcu(void)
{
unsigned long gp_seq = get_state_synchronize_rcu();
+ unsigned long flags;

if (unlikely(is_idle_task(current))) {
+ local_irq_save(flags);
/* force scheduling for rcu_qs() */
- resched_cpu(0);
+ raise_ksoftirqd_irqsoff(RCU_SOFTIRQ);
+ local_irq_restore(flags);
}
return gp_seq;
}
diff --git a/kernel/softirq.c b/kernel/softirq.c
index 210cf5f8d92c..ef105cbdc705 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -695,6 +695,14 @@ void __raise_softirq_irqoff(unsigned int nr)
or_softirq_pending(1UL << nr);
}

+#ifdef CONFIG_RCU_TINY
+void raise_ksoftirqd_irqsoff(unsigned int nr)
+{
+ __raise_softirq_irqoff(nr);
+ wakeup_softirqd();
+}
+#endif
+
void open_softirq(int nr, void (*action)(struct softirq_action *))
{
softirq_vec[nr].action = action;




2023-10-11 15:59:58

by Joel Fernandes

Subject: Re: [PATCH 5.15 000/183] 5.15.134-rc1 review

Hello Greg,

On Sat, Oct 7, 2023 at 9:00 PM Greg Kroah-Hartman
<[email protected]> wrote:
>
> This is the start of the stable review cycle for the 5.15.134 release.
> There are 183 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Fri, 06 Oct 2023 17:51:12 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.134-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
> and the diffstat can be found below.
[...]
> Liam R. Howlett <[email protected]>
> kernel/sched: Modify initial boot task idle setup
>

Let us drop this patch because it caused new tasks-RCU warnings (both
normal and rude tasks RCU) in my stable test rig. We are still discussing
the "right fix"; once that is settled, a backport can be done.

I hope Liam is also OK with that. I am happy to do that future backport if needed.

Thanks,

- Joel


> Heiner Kallweit <[email protected]>
> i2c: i801: unregister tco_pdev in i801_probe() error path
>
> Niklas Cassel <[email protected]>
> ata: libata-scsi: ignore reserved bits for REPORT SUPPORTED OPERATION CODES
>
> Kailang Yang <[email protected]>
> ALSA: hda: Disable power save for solving pop issue on Lenovo ThinkCentre M70q
>
> Pablo Neira Ayuso <[email protected]>
> netfilter: nf_tables: disallow rule removal from chain binding
>
> Pan Bian <[email protected]>
> nilfs2: fix potential use after free in nilfs_gccache_submit_read_data()
>
> Andy Shevchenko <[email protected]>
> serial: 8250_port: Check IRQ data before use
>
> Daniel Starke <[email protected]>
> Revert "tty: n_gsm: fix UAF in gsm_cleanup_mux"
>
> Ricky WU <[email protected]>
> misc: rtsx: Fix some platforms can not boot and move the l1ss judgment to probe
>
> Pu Wen <[email protected]>
> x86/srso: Add SRSO mitigation for Hygon processors
>
> Nicolin Chen <[email protected]>
> iommu/arm-smmu-v3: Fix soft lockup triggered by arm_smmu_mm_invalidate_range
>
> Vishal Goel <[email protected]>
> Smack:- Use overlay inode label in smack_inode_copy_up()
>
> Roberto Sassu <[email protected]>
> smack: Retrieve transmuting information in smack_inode_getsecurity()
>
> Roberto Sassu <[email protected]>
> smack: Record transmuting in smk_transmuted
>
> Irvin Cote <[email protected]>
> nvme-pci: always return an ERR_PTR from nvme_pci_alloc_dev
>
> Gleb Chesnokov <[email protected]>
> scsi: qla2xxx: Fix NULL pointer dereference in target mode
>
> Ian Rogers <[email protected]>
> perf metric: Return early if no CPU PMU table exists
>
> Andrii Staikov <[email protected]>
> i40e: fix potential NULL pointer dereferencing of pf->vf i40e_sync_vsi_filters()
>
> Mika Westerberg <[email protected]>
> watchdog: iTCO_wdt: Set NO_REBOOT if the watchdog is not already running
>
> Mika Westerberg <[email protected]>
> watchdog: iTCO_wdt: No need to stop the timer in probe
>
> Pratyush Yadav <[email protected]>
> nvme-pci: do not set the NUMA node of device if it has none
>
> Christoph Hellwig <[email protected]>
> nvme-pci: factor out a nvme_pci_alloc_dev helper
>
> Christoph Hellwig <[email protected]>
> nvme-pci: factor the iod mempool creation into a helper
>
> Chengming Zhou <[email protected]>
> cgroup: Fix suspicious rcu_dereference_check() usage warning
>
> Chengming Zhou <[email protected]>
> sched/cpuacct: Optimize away RCU read lock
>
> Arnaldo Carvalho de Melo <[email protected]>
> perf build: Define YYNOMEM as YYNOABORT for bison < 3.81
>
> Thomas Zimmermann <[email protected]>
> fbdev/sh7760fb: Depend on FB=y
>
> Johnathan Mantey <[email protected]>
> ncsi: Propagate carrier gain/loss events to the NCSI controller
>
> Benjamin Gray <[email protected]>
> powerpc/watchpoints: Annotate atomic context in more places
>
> Benjamin Gray <[email protected]>
> powerpc/watchpoint: Disable pagefaults when getting user instruction
>
> Benjamin Gray <[email protected]>
> powerpc/watchpoints: Disable preemption in thread_change_pc()
>
> Hans Verkuil <[email protected]>
> media: vb2: frame_vector.c: replace WARN_ONCE with a comment
>
> Chancel Liu <[email protected]>
> ASoC: imx-rpmsg: Set ignore_pmdown_time for dai_link
>
> Stanislav Fomichev <[email protected]>
> bpf: Clarify error expectations from bpf_clone_redirect
>
> Shengjiu Wang <[email protected]>
> ASoC: fsl: imx-pcm-rpmsg: Add SNDRV_PCM_INFO_BATCH flag
>
> Valentin Caron <[email protected]>
> spi: stm32: add a delay before SPI disable
>
> Han Xu <[email protected]>
> spi: nxp-fspi: reset the FLSHxCR1 registers
>
> Niklas Cassel <[email protected]>
> ata: libata-eh: do not clear ATA_PFLAG_EH_PENDING in ata_eh_reset()
>
> Steve French <[email protected]>
> smb3: correct places where ENOTSUPP is used instead of preferred EOPNOTSUPP
>
> Michal Grzedzicki <[email protected]>
> scsi: pm80xx: Avoid leaking tags when processing OPC_INB_SET_CONTROLLER_CONFIG command
>
> Michal Grzedzicki <[email protected]>
> scsi: pm80xx: Use phy-specific SAS address when sending PHY_START command
>
> David Francis <[email protected]>
> drm/amdgpu: Handle null atom context in VBIOS info ioctl
>
> Swapnil Patel <[email protected]>
> drm/amd/display: Don't check registers, if using AUX BL control
>
> David Thompson <[email protected]>
> platform/mellanox: mlxbf-bootctl: add NET dependency into Kconfig
>
> Steven Rostedt (Google) <[email protected]>
> ring-buffer: Do not attempt to read past "commit"
>
> Ricardo B. Marliere <[email protected]>
> selftests: fix dependency checker script
>
> Filipe Manana <[email protected]>
> btrfs: improve error message after failure to add delayed dir index item
>
> Zheng Yejian <[email protected]>
> ring-buffer: Avoid softlockup in ring_buffer_resize()
>
> Zheng Yejian <[email protected]>
> selftests/ftrace: Correctly enable event in instance-event.tc
>
> Kiwoong Kim <[email protected]>
> scsi: ufs: core: Move __ufshcd_send_uic_cmd() outside host_lock
>
> Javed Hasan <[email protected]>
> scsi: qedf: Add synchronization between I/O completions and abort
>
> Helge Deller <[email protected]>
> parisc: irq: Make irq_stack_union static to avoid sparse warning
>
> Helge Deller <[email protected]>
> parisc: drivers: Fix sparse warning
>
> Helge Deller <[email protected]>
> parisc: iosapic.c: Fix sparse warnings
>
> Helge Deller <[email protected]>
> parisc: sba: Fix compile warning wrt list of SBA devices
>
> Tobias Schramm <[email protected]>
> spi: sun6i: fix race between DMA RX transfer completion and RX FIFO drain
>
> Tobias Schramm <[email protected]>
> spi: sun6i: reduce DMA RX transfer width to single byte
>
> Sergey Senozhatsky <[email protected]>
> dma-debug: don't call __dma_entry_alloc_check_leak() under free_entries_lock
>
> William A. Kennington III <[email protected]>
> i2c: npcm7xx: Fix callback completion ordering
>
> Wenhua Lin <[email protected]>
> gpio: pmic-eic-sprd: Add can_sleep flag for PMIC EIC chip
>
> Nathan Rossi <[email protected]>
> soc: imx8m: Enable OCOTP clock for imx8mm before reading registers
>
> Max Filippov <[email protected]>
> xtensa: boot/lib: fix function prototypes
>
> Randy Dunlap <[email protected]>
> xtensa: boot: don't add include-dirs
>
> Randy Dunlap <[email protected]>
> xtensa: iss/network: make functions static
>
> Max Filippov <[email protected]>
> xtensa: add default definition for XCHAL_HAVE_DIV32
>
> Christophe JAILLET <[email protected]>
> firmware: imx-dsp: Fix an error handling path in imx_dsp_setup_channels()
>
> Dan Carpenter <[email protected]>
> power: supply: ucs1002: fix error code in ucs1002_get_property()
>
> Tony Lindgren <[email protected]>
> bus: ti-sysc: Fix SYSC_QUIRK_SWSUP_SIDLE_ACT handling for uart wake-up
>
> Tony Lindgren <[email protected]>
> ARM: dts: ti: omap: motorola-mapphone: Fix abe_clkctrl warning on boot
>
> Tony Lindgren <[email protected]>
> ARM: dts: ti: omap: Fix bandgap thermal cells addressing for omap3/4
>
> Krzysztof Kozlowski <[email protected]>
> ARM: dts: omap: correct indentation
>
> Thomas Gleixner <[email protected]>
> treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_56.RULE (part 1)
>
> Timo Alho <[email protected]>
> clk: tegra: fix error return case for recalc_rate
>
> Adam Ford <[email protected]>
> bus: ti-sysc: Fix missing AM35xx SoC matching
>
> Julien Panis <[email protected]>
> bus: ti-sysc: Use fsleep() instead of usleep_range() in sysc_reset()
>
> Marek Vasut <[email protected]>
> drm/bridge: ti-sn65dsi83: Do not generate HFP/HBP/HSA and EOT packet
>
> Christoph Hellwig <[email protected]>
> MIPS: Alchemy: only build mmc support helpers if au1xmmc is enabled
>
> Qu Wenruo <[email protected]>
> btrfs: reset destination buffer when read_extent_buffer() gets invalid range
>
> Nilesh Javali <[email protected]>
> scsi: qla2xxx: Use raw_smp_processor_id() instead of smp_processor_id()
>
> Shreyas Deodhar <[email protected]>
> scsi: qla2xxx: Select qpair depending on which CPU post_cmd() gets called
>
> Werner Fischer <[email protected]>
> ata: ahci: Add Elkhart Lake AHCI controller
>
> Mario Limonciello <[email protected]>
> ata: ahci: Rename board_ahci_mobile
>
> Paul Menzel <[email protected]>
> ata: ahci: Add support for AMD A85 FCH (Hudson D4)
>
> Paul Menzel <[email protected]>
> ata: libata: Rename link flag ATA_LFLAG_NO_DB_DELAY
>
> Xiao Liang <[email protected]>
> netfilter: nft_exthdr: Fix non-linear header modification
>
> Florian Westphal <[email protected]>
> netfilter: exthdr: add support for tcp option removal
>
> Namhyung Kim <[email protected]>
> perf build: Update build rule for generated files
>
> Ian Rogers <[email protected]>
> perf jevents: Switch build to use jevents.py
>
> Werner Sembach <[email protected]>
> Input: i8042 - add quirk for TUXEDO Gemini 17 Gen1/Clevo PD70PN
>
> Huacai Chen <[email protected]>
> Input: i8042 - rename i8042-x86ia64io.h to i8042-acpipnpio.h
>
> Darrick J. Wong <[email protected]>
> xfs: fix xfs_inodegc_stop racing with mod_delayed_work
>
> Darrick J. Wong <[email protected]>
> xfs: disable reaping in fscounters scrub
>
> Darrick J. Wong <[email protected]>
> xfs: check that per-cpu inodegc workers actually run on that cpu
>
> Darrick J. Wong <[email protected]>
> xfs: explicitly specify cpu when forcing inodegc delayed work to run immediately
>
> Dave Chinner <[email protected]>
> xfs: introduce xfs_inodegc_push()
>
> Dave Chinner <[email protected]>
> xfs: bound maximum wait time for inodegc work
>
> Liang He <[email protected]>
> i2c: mux: gpio: Add missing fwnode_handle_put()
>
> Andy Shevchenko <[email protected]>
> i2c: mux: gpio: Replace custom acpi_get_local_address()
>
> Xiaoke Wang <[email protected]>
> i2c: mux: demux-pinctrl: check the return value of devm_kstrdup()
>
> Christophe JAILLET <[email protected]>
> gpio: tb10x: Fix an error handling path in tb10x_gpio_probe()
>
> Sasha Levin <[email protected]>
> Fix up backport of 136191703038 ("interconnect: Teach lockdep about icc_bw_lock order")
>
> Muhammad Husaini Zulkifli <[email protected]>
> igc: Expose tx-usecs coalesce setting to user
>
> Sebastian Andrzej Siewior <[email protected]>
> bnxt_en: Flush XDP for bnxt_poll_nitroa0()'s NAPI
>
> Sebastian Andrzej Siewior <[email protected]>
> net: ena: Flush XDP packets on error.
>
> Sebastian Andrzej Siewior <[email protected]>
> locking/seqlock: Do the lockdep annotation before locking in do_write_seqcount_begin_nested()
>
> Jozsef Kadlecsik <[email protected]>
> netfilter: ipset: Fix race between IPSET_CMD_CREATE and IPSET_CMD_SWAP
>
> Florian Westphal <[email protected]>
> netfilter: nf_tables: disable toggling dormant table state more than once
>
> Artem Chernyshev <[email protected]>
> net: rds: Fix possible NULL-pointer dereference
>
> Ziyang Xuan <[email protected]>
> team: fix null-ptr-deref when team device type is changed
>
> Eric Dumazet <[email protected]>
> net: bridge: use DEV_STATS_INC()
>
> Jie Wang <[email protected]>
> net: hns3: add 5ms delay before clear firmware reset irq source
>
> Jijie Shao <[email protected]>
> net: hns3: fix fail to delete tc flower rules during reset issue
>
> Jian Shen <[email protected]>
> net: hns3: only enable unicast promisc when mac table full
>
> Jie Wang <[email protected]>
> net: hns3: fix GRE checksum offload issue
>
> Josh Poimboeuf <[email protected]>
> x86/srso: Fix SBPB enablement for spec_rstack_overflow=off
>
> Josh Poimboeuf <[email protected]>
> x86/srso: Fix srso_show_state() side effect
>
> Stephen Boyd <[email protected]>
> platform/x86: intel_scu_ipc: Fail IPC send if still busy
>
> Stephen Boyd <[email protected]>
> platform/x86: intel_scu_ipc: Don't override scu in intel_scu_ipc_dev_simple_command()
>
> Stephen Boyd <[email protected]>
> platform/x86: intel_scu_ipc: Check status upon timeout in ipc_wait_for_interrupt()
>
> Stephen Boyd <[email protected]>
> platform/x86: intel_scu_ipc: Check status after timeout in busy_loop()
>
> Eric Dumazet <[email protected]>
> dccp: fix dccp_v4_err()/dccp_v6_err() again
>
> Kajol Jain <[email protected]>
> powerpc/perf/hv-24x7: Update domain value check
>
> Kyle Zeng <[email protected]>
> ipv4: fix null-deref in ipv4_link_failure
>
> Vinicius Costa Gomes <[email protected]>
> igc: Fix infinite initialization loop with early XDP redirect
>
> David Christensen <[email protected]>
> ionic: fix 16bit math issue when PAGE_SIZE >= 64KB
>
> Ivan Vecera <[email protected]>
> i40e: Fix VF VLAN offloading when port VLAN is configured
>
> Mateusz Palczewski <[email protected]>
> i40e: Add VF VLAN pruning
>
> Radoslaw Tyl <[email protected]>
> iavf: do not process adminq tasks when __IAVF_IN_REMOVE_TASK is set
>
> Shengjiu Wang <[email protected]>
> ASoC: imx-audmix: Fix return error with devm_clk_get()
>
> Sasha Neftin <[email protected]>
> net/core: Fix ETH_P_1588 flow dissector
>
> Sabrina Dubroca <[email protected]>
> selftests: tls: swap the TX and RX sockets in some tests
>
> Toke Høiland-Jørgensen <[email protected]>
> bpf: Avoid deadlock when using queue and stack maps from NMI
>
> Pablo Neira Ayuso <[email protected]>
> netfilter: nf_tables: disallow element removal on anonymous sets
>
> Jerome Brunet <[email protected]>
> ASoC: meson: spdifin: start hw on dai probe
>
> Florian Westphal <[email protected]>
> netfilter: nf_tables: fix memleak when more than 255 elements expired
>
> Pablo Neira Ayuso <[email protected]>
> netfilter: nft_set_hash: try later when GC hits EAGAIN on iteration
>
> Pablo Neira Ayuso <[email protected]>
> netfilter: nft_set_pipapo: stop GC iteration if GC transaction allocation fails
>
> Pablo Neira Ayuso <[email protected]>
> netfilter: nft_set_pipapo: call nft_trans_gc_queue_sync() in catchall GC
>
> Pablo Neira Ayuso <[email protected]>
> netfilter: nft_set_rbtree: use read spinlock to avoid datapath contention
>
> Pablo Neira Ayuso <[email protected]>
> netfilter: nft_set_rbtree: skip sync GC for new elements in this transaction
>
> Florian Westphal <[email protected]>
> netfilter: nf_tables: defer gc run if previous batch is still pending
>
> Pablo Neira Ayuso <[email protected]>
> netfilter: nf_tables: use correct lock to protect gc_list
>
> Pablo Neira Ayuso <[email protected]>
> netfilter: nf_tables: GC transaction race with abort path
>
> Pablo Neira Ayuso <[email protected]>
> netfilter: nf_tables: GC transaction race with netns dismantle
>
> Pablo Neira Ayuso <[email protected]>
> netfilter: nf_tables: fix GC transaction races with netns and netlink event exit path
>
> Florian Westphal <[email protected]>
> netfilter: nf_tables: don't fail inserts if duplicate has expired
>
> Pablo Neira Ayuso <[email protected]>
> netfilter: nf_tables: remove busy mark and gc batch API
>
> Pablo Neira Ayuso <[email protected]>
> netfilter: nft_set_hash: mark set element as dead when deleting from packet path
>
> Pablo Neira Ayuso <[email protected]>
> netfilter: nf_tables: adapt set backend to use GC transaction API
>
> Pablo Neira Ayuso <[email protected]>
> netfilter: nf_tables: GC transaction API to avoid race with control plane
>
> Florian Westphal <[email protected]>
> netfilter: nf_tables: don't skip expired elements during walk
>
> Steven Rostedt (Google) <[email protected]>
> tracing: Have event inject files inc the trace array ref count
>
> Jan Kara <[email protected]>
> ext4: do not let fstrim block system suspend
>
> Jan Kara <[email protected]>
> ext4: move setting of trimmed bit into ext4_try_to_trim_range()
>
> Kemeng Shi <[email protected]>
> ext4: replace the traditional ternary conditional operator with with max()/min()
>
> Lukas Czerner <[email protected]>
> ext4: change s_last_trim_minblks type to unsigned long
>
> Lukas Bulwahn <[email protected]>
> ext4: scope ret locally in ext4_try_to_trim_range()
>
> Szuying Chen <[email protected]>
> ata: libahci: clear pending interrupt status
>
> Hannes Reinecke <[email protected]>
> ata: ahci: Drop pointless VPRINTK() calls and convert the remaining ones
>
> Steven Rostedt (Google) <[email protected]>
> tracing: Increase trace array ref count on enable and filter files
>
> John Keeping <[email protected]>
> tracing: Make trace_marker{,_raw} stream-like
>
> Olga Kornievskaia <[email protected]>
> NFSv4.1: fix pnfs MDS=DS session trunking
>
> Olga Kornievskaia <[email protected]>
> NFSv4.1: use EXCHGID4_FLAG_USE_PNFS_DS for DS server
>
> Trond Myklebust <[email protected]>
> SUNRPC: Mark the cred for revalidation if the server rejects it
>
> Trond Myklebust <[email protected]>
> NFS/pNFS: Report EINVAL errors from connect() to the server
>
> Trond Myklebust <[email protected]>
> NFS: More fixes for nfs_direct_write_reschedule_io()
>
> Trond Myklebust <[email protected]>
> NFS: Use the correct commit info in nfs_join_page_group()
>
>
> -------------
>
> Diffstat:
>
> Makefile | 4 +-
> arch/arm/boot/dts/am33xx.dtsi | 5 +-
> arch/arm/boot/dts/am3517.dtsi | 5 +-
> arch/arm/boot/dts/am4372.dtsi | 5 +-
> arch/arm/boot/dts/artpec6-devboard.dts | 9 +-
> arch/arm/boot/dts/dm814x.dtsi | 6 +-
> arch/arm/boot/dts/dm816x.dtsi | 6 +-
> arch/arm/boot/dts/dra62x.dtsi | 6 +-
> arch/arm/boot/dts/dra7-dspeve-thermal.dtsi | 5 +-
> arch/arm/boot/dts/dra7-iva-thermal.dtsi | 5 +-
> arch/arm/boot/dts/imx6q-gk802.dts | 9 +-
> arch/arm/boot/dts/motorola-mapphone-common.dtsi | 4 +-
> arch/arm/boot/dts/omap-gpmc-smsc911x.dtsi | 6 +-
> arch/arm/boot/dts/omap-gpmc-smsc9221.dtsi | 6 +-
> arch/arm/boot/dts/omap2.dtsi | 5 +-
> arch/arm/boot/dts/omap2420.dtsi | 5 +-
> arch/arm/boot/dts/omap2430.dtsi | 5 +-
> arch/arm/boot/dts/omap3-cm-t3517.dts | 12 +-
> arch/arm/boot/dts/omap3-cpu-thermal.dtsi | 8 +-
> arch/arm/boot/dts/omap3-gta04.dtsi | 6 +-
> arch/arm/boot/dts/omap3-ldp.dts | 2 +-
> arch/arm/boot/dts/omap3-n900.dts | 38 +-
> arch/arm/boot/dts/omap3-zoom3.dts | 44 +--
> arch/arm/boot/dts/omap3.dtsi | 5 +-
> arch/arm/boot/dts/omap34xx.dtsi | 5 +-
> arch/arm/boot/dts/omap36xx.dtsi | 5 +-
> arch/arm/boot/dts/omap4-cpu-thermal.dtsi | 34 +-
> arch/arm/boot/dts/omap443x.dtsi | 6 +-
> arch/arm/boot/dts/omap4460.dtsi | 6 +-
> arch/arm/boot/dts/omap5-cm-t54.dts | 56 +--
> arch/arm/boot/dts/omap5-core-thermal.dtsi | 5 +-
> arch/arm/boot/dts/omap5-gpu-thermal.dtsi | 5 +-
> arch/arm/boot/dts/orion5x-lacie-d2-network.dts | 5 +-
> .../dts/orion5x-lacie-ethernet-disk-mini-v2.dts | 9 +-
> .../boot/dts/orion5x-maxtor-shared-storage-2.dts | 5 +-
> arch/arm/boot/dts/orion5x-mv88f5181.dtsi | 9 +-
> arch/arm/boot/dts/orion5x-mv88f5182.dtsi | 9 +-
> arch/arm/boot/dts/orion5x-netgear-wnr854t.dts | 9 +-
> arch/arm/boot/dts/orion5x-rd88f5182-nas.dts | 9 +-
> arch/arm/boot/dts/orion5x.dtsi | 9 +-
> arch/arm/include/asm/hardware/cache-aurora-l2.h | 5 +-
> arch/arm/include/asm/hardware/cache-feroceon-l2.h | 6 +-
> arch/arm/include/asm/hardware/cache-tauros2.h | 5 +-
> arch/arm/mach-davinci/board-da830-evm.c | 6 +-
> arch/arm/mach-davinci/board-da850-evm.c | 6 +-
> arch/arm/mach-davinci/board-dm355-evm.c | 6 +-
> arch/arm/mach-davinci/board-dm355-leopard.c | 5 +-
> arch/arm/mach-davinci/board-dm644x-evm.c | 6 +-
> arch/arm/mach-davinci/board-dm646x-evm.c | 7 +-
> arch/arm/mach-davinci/board-mityomapl138.c | 5 +-
> arch/arm/mach-davinci/board-neuros-osd2.c | 5 +-
> arch/arm/mach-davinci/board-omapl138-hawk.c | 5 +-
> arch/arm/mach-davinci/common.c | 6 +-
> arch/arm/mach-davinci/cpuidle.h | 5 +-
> arch/arm/mach-davinci/da830.c | 6 +-
> arch/arm/mach-davinci/da850.c | 6 +-
> arch/arm/mach-davinci/dm355.c | 6 +-
> arch/arm/mach-davinci/dm644x.c | 6 +-
> arch/arm/mach-davinci/dm646x.c | 6 +-
> arch/arm/mach-davinci/include/mach/common.h | 6 +-
> arch/arm/mach-davinci/include/mach/cputype.h | 6 +-
> arch/arm/mach-davinci/include/mach/da8xx.h | 6 +-
> arch/arm/mach-davinci/include/mach/hardware.h | 6 +-
> arch/arm/mach-davinci/include/mach/serial.h | 6 +-
> arch/arm/mach-davinci/mux.c | 6 +-
> arch/arm/mach-davinci/mux.h | 6 +-
> arch/arm/mach-davinci/pm_domain.c | 5 +-
> arch/arm/mach-dove/bridge-regs.h | 9 +-
> arch/arm/mach-dove/cm-a510.c | 5 +-
> arch/arm/mach-dove/common.c | 5 +-
> arch/arm/mach-dove/common.h | 5 +-
> arch/arm/mach-dove/dove-db-setup.c | 5 +-
> arch/arm/mach-dove/dove.h | 9 +-
> arch/arm/mach-dove/irq.c | 5 +-
> arch/arm/mach-dove/irqs.h | 9 +-
> arch/arm/mach-dove/mpp.c | 5 +-
> arch/arm/mach-dove/pcie.c | 5 +-
> arch/arm/mach-dove/pm.h | 6 +-
> arch/arm/mach-lpc18xx/board-dt.c | 5 +-
> arch/arm/mach-lpc32xx/pm.c | 6 +-
> arch/arm/mach-lpc32xx/suspend.S | 6 +-
> arch/arm/mach-mv78xx0/bridge-regs.h | 6 +-
> arch/arm/mach-mv78xx0/buffalo-wxl-setup.c | 5 +-
> arch/arm/mach-mv78xx0/common.c | 5 +-
> arch/arm/mach-mv78xx0/common.h | 5 +-
> arch/arm/mach-mv78xx0/db78x00-bp-setup.c | 5 +-
> arch/arm/mach-mv78xx0/irq.c | 5 +-
> arch/arm/mach-mv78xx0/irqs.h | 9 +-
> arch/arm/mach-mv78xx0/mpp.c | 5 +-
> arch/arm/mach-mv78xx0/mpp.h | 6 +-
> arch/arm/mach-mv78xx0/mv78xx0.h | 5 +-
> arch/arm/mach-mv78xx0/pcie.c | 5 +-
> arch/arm/mach-mv78xx0/rd78x00-masa-setup.c | 5 +-
> arch/arm/mach-mvebu/armada-370-xp.h | 5 +-
> arch/arm/mach-mvebu/board-v7.c | 5 +-
> arch/arm/mach-mvebu/coherency.c | 5 +-
> arch/arm/mach-mvebu/coherency.h | 6 +-
> arch/arm/mach-mvebu/coherency_ll.S | 5 +-
> arch/arm/mach-mvebu/common.h | 5 +-
> arch/arm/mach-mvebu/cpu-reset.c | 5 +-
> arch/arm/mach-mvebu/dove.c | 5 +-
> arch/arm/mach-mvebu/headsmp-a9.S | 5 +-
> arch/arm/mach-mvebu/headsmp.S | 5 +-
> arch/arm/mach-mvebu/kirkwood.c | 5 +-
> arch/arm/mach-mvebu/kirkwood.h | 5 +-
> arch/arm/mach-mvebu/mvebu-soc-id.c | 5 +-
> arch/arm/mach-mvebu/mvebu-soc-id.h | 5 +-
> arch/arm/mach-mvebu/platsmp-a9.c | 5 +-
> arch/arm/mach-mvebu/platsmp.c | 5 +-
> arch/arm/mach-mvebu/pm-board.c | 5 +-
> arch/arm/mach-mvebu/pm.c | 5 +-
> arch/arm/mach-mvebu/pmsu.c | 5 +-
> arch/arm/mach-mvebu/pmsu.h | 5 +-
> arch/arm/mach-mvebu/pmsu_ll.S | 5 +-
> arch/arm/mach-mvebu/system-controller.c | 5 +-
> arch/arm/mach-omap1/include/mach/mtd-xip.h | 6 +-
> arch/arm/mach-omap1/pm_bus.c | 6 +-
> arch/arm/mach-omap2/prcm43xx.h | 5 +-
> arch/arm/mach-omap2/vc.c | 6 +-
> arch/arm/mach-orion5x/board-d2net.c | 5 +-
> arch/arm/mach-orion5x/board-dt.c | 5 +-
> arch/arm/mach-orion5x/board-rd88f5182.c | 5 +-
> arch/arm/mach-orion5x/bridge-regs.h | 9 +-
> arch/arm/mach-orion5x/common.c | 5 +-
> arch/arm/mach-orion5x/db88f5281-setup.c | 5 +-
> arch/arm/mach-orion5x/irq.c | 5 +-
> arch/arm/mach-orion5x/irqs.h | 5 +-
> arch/arm/mach-orion5x/kurobox_pro-setup.c | 5 +-
> arch/arm/mach-orion5x/ls_hgl-setup.c | 5 +-
> arch/arm/mach-orion5x/mpp.c | 5 +-
> arch/arm/mach-orion5x/net2big-setup.c | 6 +-
> arch/arm/mach-orion5x/orion5x.h | 5 +-
> arch/arm/mach-orion5x/pci.c | 5 +-
> arch/arm/mach-orion5x/rd88f5181l-fxo-setup.c | 5 +-
> arch/arm/mach-orion5x/rd88f5181l-ge-setup.c | 5 +-
> arch/arm/mach-orion5x/rd88f5182-setup.c | 5 +-
> arch/arm/mach-orion5x/rd88f6183ap-ge-setup.c | 5 +-
> arch/arm/mach-orion5x/ts78xx-setup.c | 5 +-
> arch/arm/mach-orion5x/wnr854t-setup.c | 9 +-
> arch/arm/mach-orion5x/wrt350n-v2-setup.c | 9 +-
> arch/arm/mach-pxa/eseries.c | 7 +-
> arch/arm/mach-pxa/standby.S | 6 +-
> arch/arm/mach-spear/generic.h | 5 +-
> arch/arm/mach-spear/include/mach/misc_regs.h | 5 +-
> arch/arm/mach-spear/include/mach/spear.h | 5 +-
> arch/arm/mach-spear/pl080.c | 5 +-
> arch/arm/mach-spear/pl080.h | 5 +-
> arch/arm/mach-spear/restart.c | 5 +-
> arch/arm/mach-spear/spear1310.c | 5 +-
> arch/arm/mach-spear/spear1340.c | 5 +-
> arch/arm/mach-spear/spear13xx.c | 5 +-
> arch/arm/mach-spear/spear300.c | 5 +-
> arch/arm/mach-spear/spear310.c | 5 +-
> arch/arm/mach-spear/spear320.c | 5 +-
> arch/arm/mach-spear/spear3xx.c | 5 +-
> arch/arm/mach-spear/spear6xx.c | 5 +-
> arch/arm/mach-spear/time.c | 5 +-
> arch/arm/mm/cache-feroceon-l2.c | 5 +-
> arch/arm/mm/cache-tauros2.c | 5 +-
> arch/mips/alchemy/devboards/db1000.c | 4 +
> arch/mips/alchemy/devboards/db1200.c | 6 +
> arch/mips/alchemy/devboards/db1300.c | 4 +
> arch/parisc/include/asm/ropes.h | 3 +
> arch/parisc/kernel/drivers.c | 2 +-
> arch/parisc/kernel/irq.c | 2 +-
> arch/powerpc/kernel/hw_breakpoint.c | 16 +-
> arch/powerpc/kernel/hw_breakpoint_constraints.c | 7 +-
> arch/powerpc/perf/hv-24x7.c | 2 +-
> arch/x86/kernel/cpu/bugs.c | 4 +-
> arch/x86/kernel/cpu/common.c | 2 +-
> arch/xtensa/boot/Makefile | 3 +-
> arch/xtensa/boot/lib/zmem.c | 5 +-
> arch/xtensa/include/asm/core.h | 4 +
> arch/xtensa/platforms/iss/network.c | 4 +-
> drivers/ata/ahci.c | 111 +++---
> drivers/ata/ahci_brcm.c | 2 +-
> drivers/ata/ahci_xgene.c | 4 -
> drivers/ata/libahci.c | 49 +--
> drivers/ata/libata-core.c | 47 ++-
> drivers/ata/libata-eh.c | 13 +-
> drivers/ata/libata-sata.c | 2 +-
> drivers/ata/libata-scsi.c | 2 +-
> drivers/ata/libata-transport.c | 9 +-
> drivers/ata/libata.h | 2 +
> drivers/bus/ti-sysc.c | 31 +-
> drivers/char/agp/parisc-agp.c | 2 -
> drivers/clk/tegra/clk-bpmp.c | 2 +-
> drivers/firmware/imx/imx-dsp.c | 1 +
> drivers/gpio/gpio-pmic-eic-sprd.c | 1 +
> drivers/gpio/gpio-tb10x.c | 6 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 17 +-
> .../amd/display/dc/dce110/dce110_hw_sequencer.c | 4 +-
> drivers/gpu/drm/bridge/ti-sn65dsi83.c | 4 +-
> drivers/gpu/drm/meson/meson_encoder_hdmi.c | 2 +
> drivers/i2c/busses/i2c-i801.c | 1 +
> drivers/i2c/busses/i2c-npcm7xx.c | 17 +-
> drivers/i2c/muxes/i2c-demux-pinctrl.c | 4 +
> drivers/i2c/muxes/i2c-mux-gpio.c | 47 +--
> .../serio/{i8042-x86ia64io.h => i8042-acpipnpio.h} | 13 +-
> drivers/input/serio/i8042.h | 2 +-
> drivers/interconnect/core.c | 1 +
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 27 +-
> drivers/media/common/videobuf2/frame_vector.c | 6 +-
> drivers/misc/cardreader/rts5227.c | 55 +--
> drivers/misc/cardreader/rts5228.c | 57 +--
> drivers/misc/cardreader/rts5249.c | 56 +--
> drivers/misc/cardreader/rts5260.c | 43 +--
> drivers/misc/cardreader/rts5261.c | 52 +--
> drivers/misc/cardreader/rtsx_pcr.c | 51 ++-
> drivers/net/ethernet/amazon/ena/ena_netdev.c | 3 +
> drivers/net/ethernet/broadcom/bnxt/bnxt.c | 5 +
> drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 9 +
> .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 13 +-
> drivers/net/ethernet/intel/i40e/i40e.h | 1 +
> drivers/net/ethernet/intel/i40e/i40e_ethtool.c | 9 +
> drivers/net/ethernet/intel/i40e/i40e_main.c | 138 ++++++-
> drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 16 +-
> drivers/net/ethernet/intel/iavf/iavf_main.c | 3 +-
> drivers/net/ethernet/intel/igc/igc_ethtool.c | 31 +-
> drivers/net/ethernet/intel/igc/igc_main.c | 2 +-
> drivers/net/ethernet/pensando/ionic/ionic_dev.h | 1 +
> drivers/net/ethernet/pensando/ionic/ionic_txrx.c | 10 +-
> drivers/net/team/team.c | 10 +-
> drivers/net/thunderbolt.c | 3 +-
> drivers/nvme/host/pci.c | 121 ++++---
> drivers/parisc/iosapic.c | 4 +-
> drivers/parisc/iosapic_private.h | 4 +-
> drivers/platform/mellanox/Kconfig | 1 +
> drivers/platform/x86/intel_scu_ipc.c | 66 ++--
> drivers/power/supply/ucs1002_power.c | 3 +-
> drivers/scsi/pm8001/pm8001_hwi.c | 2 +-
> drivers/scsi/pm8001/pm80xx_hwi.c | 4 +-
> drivers/scsi/qedf/qedf_io.c | 10 +-
> drivers/scsi/qedf/qedf_main.c | 7 +-
> drivers/scsi/qla2xxx/qla_def.h | 3 +
> drivers/scsi/qla2xxx/qla_init.c | 5 +-
> drivers/scsi/qla2xxx/qla_inline.h | 58 +++
> drivers/scsi/qla2xxx/qla_isr.c | 12 +-
> drivers/scsi/qla2xxx/qla_nvme.c | 4 +
> drivers/scsi/qla2xxx/qla_os.c | 6 +
> drivers/scsi/qla2xxx/qla_target.c | 3 +-
> drivers/scsi/qla2xxx/tcm_qla2xxx.c | 4 +-
> drivers/scsi/ufs/ufshcd.c | 6 +-
> drivers/soc/imx/soc-imx8m.c | 10 +
> drivers/spi/spi-nxp-fspi.c | 7 +
> drivers/spi/spi-stm32.c | 8 +
> drivers/spi/spi-sun6i.c | 31 +-
> drivers/tty/n_gsm.c | 4 +-
> drivers/tty/serial/8250/8250_port.c | 5 +-
> drivers/video/fbdev/Kconfig | 2 +-
> drivers/watchdog/iTCO_wdt.c | 26 +-
> fs/binfmt_elf_fdpic.c | 5 +-
> fs/btrfs/delayed-inode.c | 7 +-
> fs/btrfs/extent_io.c | 8 +-
> fs/btrfs/super.c | 2 +-
> fs/cifs/inode.c | 2 +-
> fs/cifs/smb2ops.c | 6 +-
> fs/ext4/ext4.h | 2 +-
> fs/ext4/mballoc.c | 67 ++--
> fs/nfs/direct.c | 25 +-
> fs/nfs/flexfilelayout/flexfilelayout.c | 1 +
> fs/nfs/nfs4client.c | 9 +-
> fs/nfs/nfs4proc.c | 4 +
> fs/nfs/write.c | 23 +-
> fs/nilfs2/gcinode.c | 6 +-
> fs/proc/task_nommu.c | 27 +-
> fs/xfs/scrub/common.c | 25 --
> fs/xfs/scrub/common.h | 2 -
> fs/xfs/scrub/fscounters.c | 13 +-
> fs/xfs/scrub/scrub.c | 2 -
> fs/xfs/scrub/scrub.h | 1 -
> fs/xfs/xfs_icache.c | 92 +++--
> fs/xfs/xfs_icache.h | 1 +
> fs/xfs/xfs_mount.h | 5 +-
> fs/xfs/xfs_qm_syscalls.c | 9 +-
> fs/xfs/xfs_super.c | 12 +-
> fs/xfs/xfs_trace.h | 1 +
> include/linux/btf_ids.h | 2 +-
> include/linux/cgroup.h | 3 +-
> include/linux/if_team.h | 2 +
> include/linux/libata.h | 4 +-
> include/linux/nfs_fs_sb.h | 1 +
> include/linux/nfs_page.h | 4 +-
> include/linux/seqlock.h | 2 +-
> include/net/netfilter/nf_tables.h | 127 +++----
> include/uapi/linux/bpf.h | 4 +-
> io_uring/io_uring.c | 2 +-
> kernel/bpf/queue_stack_maps.c | 21 +-
> kernel/dma/debug.c | 20 +-
> kernel/sched/core.c | 2 +-
> kernel/sched/cpuacct.c | 4 +-
> kernel/sched/cpupri.c | 1 +
> kernel/sched/idle.c | 1 +
> kernel/trace/ring_buffer.c | 10 +
> kernel/trace/trace.c | 45 ++-
> kernel/trace/trace.h | 2 +
> kernel/trace/trace_events.c | 6 +-
> kernel/trace/trace_events_inject.c | 3 +-
> net/bridge/br_forward.c | 4 +-
> net/bridge/br_input.c | 4 +-
> net/core/flow_dissector.c | 2 +-
> net/dccp/ipv4.c | 9 +-
> net/dccp/ipv6.c | 9 +-
> net/ipv4/route.c | 4 +-
> net/ncsi/ncsi-aen.c | 5 +
> net/netfilter/ipset/ip_set_core.c | 12 +-
> net/netfilter/nf_tables_api.c | 400 +++++++++++++++++----
> net/netfilter/nft_exthdr.c | 110 +++++-
> net/netfilter/nft_set_hash.c | 87 +++--
> net/netfilter/nft_set_pipapo.c | 71 ++--
> net/netfilter/nft_set_rbtree.c | 161 +++++----
> net/rds/rdma_transport.c | 12 +-
> net/sunrpc/clnt.c | 15 +-
> security/smack/smack.h | 1 +
> security/smack/smack_lsm.c | 65 +++-
> sound/pci/hda/hda_intel.c | 1 +
> sound/soc/fsl/imx-audmix.c | 2 +-
> sound/soc/fsl/imx-pcm-rpmsg.c | 1 +
> sound/soc/fsl/imx-rpmsg.c | 8 +
> sound/soc/meson/axg-spdifin.c | 49 +--
> tools/build/Makefile.build | 10 +
> tools/include/linux/btf_ids.h | 2 +-
> tools/include/uapi/linux/bpf.h | 4 +-
> tools/perf/Makefile.config | 19 +
> tools/perf/Makefile.perf | 1 +
> tools/perf/pmu-events/Build | 19 +-
> tools/perf/pmu-events/empty-pmu-events.c | 158 ++++++++
> tools/perf/util/Build | 6 +
> tools/perf/util/metricgroup.c | 3 +
> .../ftrace/test.d/instances/instance-event.tc | 2 +-
> tools/testing/selftests/kselftest_deps.sh | 77 +++-
> tools/testing/selftests/net/tls.c | 8 +-
> 332 files changed, 2602 insertions(+), 1896 deletions(-)
>
>

2023-10-11 16:31:52

by Paul E. McKenney

Subject: Re: [PATCH 5.15 000/183] 5.15.134-rc1 review

On Wed, Oct 11, 2023 at 03:47:23PM +0200, Frederic Weisbecker wrote:
> On Tue, Oct 10, 2023 at 06:34:35PM -0700, Paul E. McKenney wrote:
> > If this problem is real, fixes include:
> >
> > o Revert Liam's patch and make Tiny RCU's call_rcu() deal with
> > the problem. This is overhead and non-tinyness, but to Joel's
> > point, it might be best.
>
> But what is calling call_rcu() or start_poll_synchronize_rcu() so
> early that the CPU is not even online? (that's before boot_cpu_init() !)
>
> Deferring PF_IDLE setting might pave the way for more issues like this one,
> present or future. Though is_idle_task() returning true when the task is not
> in the idle loop but is playing the init/0 role is debatable.
>
> An alternative for tiny RCU is to force waking up ksoftirqd when call_rcu()
> is called from the idle task, since rcu_qs() during the context switch raises
> a softirq anyway. That is more overhead for start_poll_synchronize_rcu(), but
> do we expect much RCU polling in idle?

Nice!!!

This does solve the original problem with little or no additional overhead
(perhaps even with decreased overhead), and avoids the other RCU Tasks
issues.

Thanx, Paul

> diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
> index a92bce40b04b..6ab15233e2be 100644
> --- a/include/linux/interrupt.h
> +++ b/include/linux/interrupt.h
> @@ -604,6 +604,7 @@ extern void __raise_softirq_irqoff(unsigned int nr);
>
> extern void raise_softirq_irqoff(unsigned int nr);
> extern void raise_softirq(unsigned int nr);
> +extern void raise_ksoftirqd_irqsoff(unsigned int nr);
>
> DECLARE_PER_CPU(struct task_struct *, ksoftirqd);
>
> diff --git a/kernel/rcu/tiny.c b/kernel/rcu/tiny.c
> index 42f7589e51e0..872dab8b8b53 100644
> --- a/kernel/rcu/tiny.c
> +++ b/kernel/rcu/tiny.c
> @@ -189,12 +189,12 @@ void call_rcu(struct rcu_head *head, rcu_callback_t func)
>  	local_irq_save(flags);
>  	*rcu_ctrlblk.curtail = head;
>  	rcu_ctrlblk.curtail = &head->next;
> -	local_irq_restore(flags);
>
>  	if (unlikely(is_idle_task(current))) {
>  		/* force scheduling for rcu_qs() */
> -		resched_cpu(0);
> +		raise_ksoftirqd_irqsoff(RCU_SOFTIRQ);
>  	}
> +	local_irq_restore(flags);
>  }
>  EXPORT_SYMBOL_GPL(call_rcu);
>
> @@ -225,10 +225,13 @@ EXPORT_SYMBOL_GPL(get_state_synchronize_rcu);
>  unsigned long start_poll_synchronize_rcu(void)
>  {
>  	unsigned long gp_seq = get_state_synchronize_rcu();
> +	unsigned long flags;
>
>  	if (unlikely(is_idle_task(current))) {
> +		local_irq_save(flags);
>  		/* force scheduling for rcu_qs() */
> -		resched_cpu(0);
> +		raise_ksoftirqd_irqsoff(RCU_SOFTIRQ);
> +		local_irq_restore(flags);
>  	}
>  	return gp_seq;
>  }
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index 210cf5f8d92c..ef105cbdc705 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -695,6 +695,14 @@ void __raise_softirq_irqoff(unsigned int nr)
>  	or_softirq_pending(1UL << nr);
>  }
>
> +#ifdef CONFIG_RCU_TINY
> +void raise_ksoftirqd_irqsoff(unsigned int nr)
> +{
> +	__raise_softirq_irqoff(nr);
> +	wakeup_softirqd();
> +}
> +#endif
> +
>  void open_softirq(int nr, void (*action)(struct softirq_action *))
>  {
>  	softirq_vec[nr].action = action;
>
>
>
>
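
The idea behind the quoted patch can be sketched in user space. This is a toy model under stated assumptions, not kernel code: the toy_* names, the wake_ksoftirqd parameter, and the single-pass idle loop are hypothetical, and the two booleans merely stand in for a raised RCU_SOFTIRQ and a woken ksoftirqd. It only illustrates why a callback queued from the idle task needs something to force a context switch before it can run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space toy model of the idea; none of this is kernel code. */
struct rcu_head {
	struct rcu_head *next;
	void (*func)(struct rcu_head *head);
};

/* Callback list: entries before *donetail are ready, the rest still wait. */
static struct {
	struct rcu_head *rcucblist;
	struct rcu_head **donetail;
	struct rcu_head **curtail;
} ctrl;

static bool softirq_pending;     /* stands in for RCU_SOFTIRQ being raised */
static bool ksoftirqd_runnable;  /* stands in for a woken ksoftirqd */
static int invoked;              /* number of callbacks run so far */

/* Queue a callback from the (modelled) idle task. */
static void toy_call_rcu(struct rcu_head *head,
			 void (*func)(struct rcu_head *head), bool wake_ksoftirqd)
{
	assert(head && func);
	head->func = func;
	head->next = NULL;
	*ctrl.curtail = head;
	ctrl.curtail = &head->next;
	if (wake_ksoftirqd) {
		/* Hypothetical analogue of raising the softirq and waking ksoftirqd. */
		softirq_pending = true;
		ksoftirqd_runnable = true;
	}
}

/* A context switch is a quiescent state: all queued callbacks become ready. */
static void toy_rcu_qs(void)
{
	if (ctrl.donetail != ctrl.curtail) {
		ctrl.donetail = ctrl.curtail;
		softirq_pending = true;
	}
}

/* Softirq handler: detach the ready callbacks and invoke them. */
static void toy_rcu_process_callbacks(void)
{
	struct rcu_head *list, *next;

	if (ctrl.donetail == &ctrl.rcucblist)
		return;
	list = ctrl.rcucblist;
	ctrl.rcucblist = *ctrl.donetail;
	*ctrl.donetail = NULL;
	if (ctrl.curtail == ctrl.donetail)
		ctrl.curtail = &ctrl.rcucblist;
	ctrl.donetail = &ctrl.rcucblist;
	for (; list; list = next) {
		next = list->next;
		list->func(list);
	}
}

/* One pass of the idle loop: switch to ksoftirqd only if it was woken. */
static void toy_idle_pass(void)
{
	if (!ksoftirqd_runnable)
		return;
	ksoftirqd_runnable = false;
	toy_rcu_qs();
	if (softirq_pending) {
		softirq_pending = false;
		toy_rcu_process_callbacks();
	}
}

static void count_cb(struct rcu_head *head)
{
	(void)head;
	invoked++;
}

/* Resets the model, queues one callback, returns how many callbacks ran. */
int toy_demo(bool wake_ksoftirqd)
{
	static struct rcu_head head;

	ctrl.rcucblist = NULL;
	ctrl.donetail = ctrl.curtail = &ctrl.rcucblist;
	softirq_pending = ksoftirqd_runnable = false;
	invoked = 0;
	toy_call_rcu(&head, count_cb, wake_ksoftirqd);
	toy_idle_pass();
	return invoked;
}
```

In this model, toy_demo(false) leaves the callback queued because nothing forces a switch away from idle, while toy_demo(true) wakes the modelled ksoftirqd, and the resulting context switch is the quiescent state that lets the callback run.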

2023-10-11 17:44:39

by Greg Kroah-Hartman

Subject: Re: [PATCH 5.15 000/183] 5.15.134-rc1 review

On Wed, Oct 11, 2023 at 11:58:49AM -0400, Joel Fernandes wrote:
> Hello Greg,
>
> On Sat, Oct 7, 2023 at 9:00 PM Greg Kroah-Hartman
> <[email protected]> wrote:
> >
> > This is the start of the stable review cycle for the 5.15.134 release.
> > There are 183 patches in this series, all will be posted as a response
> > to this one. If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Fri, 06 Oct 2023 17:51:12 +0000.
> > Anything received after that time might be too late.
> >
> > The whole patch series can be found in one patch at:
> > https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.134-rc1.gz
> > or in the git tree and branch at:
> > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
> > and the diffstat can be found below.
> [...]
> > Liam R. Howlett <[email protected]>
> > kernel/sched: Modify initial boot task idle setup
> >
>
> Let us drop this patch because it caused new tasks-RCU warnings (both
> normal and rude tasks RCU) in my stable test rig. We are discussing
> the "right fix" and at that time a backport can be done.
>
> Hope Liam is also Ok with that. I am happy to do that future backport if needed.

This is already in a released kernel, a bunch of them:
5.15.134 6.1.56 6.5.6 6.6-rc3
should it be reverted from all of the stable releases, or just for
5.15.y?

thanks,

greg k-h

2023-10-16 02:28:42

by Joel Fernandes

Subject: Re: [PATCH 5.15 000/183] 5.15.134-rc1 review

On Wed, Oct 11, 2023 at 1:44 PM Greg Kroah-Hartman
<[email protected]> wrote:
>
> On Wed, Oct 11, 2023 at 11:58:49AM -0400, Joel Fernandes wrote:
> > Hello Greg,
> >
> > On Sat, Oct 7, 2023 at 9:00 PM Greg Kroah-Hartman
> > <[email protected]> wrote:
> > >
> > > This is the start of the stable review cycle for the 5.15.134 release.
> > > There are 183 patches in this series, all will be posted as a response
> > > to this one. If anyone has any issues with these being applied, please
> > > let me know.
> > >
> > > Responses should be made by Fri, 06 Oct 2023 17:51:12 +0000.
> > > Anything received after that time might be too late.
> > >
> > > The whole patch series can be found in one patch at:
> > > https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.134-rc1.gz
> > > or in the git tree and branch at:
> > > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
> > > and the diffstat can be found below.
> > [...]
> > > Liam R. Howlett <[email protected]>
> > > kernel/sched: Modify initial boot task idle setup
> > >
> >
> > Let us drop this patch because it caused new tasks-RCU warnings (both
> > normal and rude tasks RCU) in my stable test rig. We are discussing
> > the "right fix" and at that time a backport can be done.
> >
> > Hope Liam is also Ok with that. I am happy to do that future backport if needed.
>
> This is already in a released kernel, a bunch of them:
> 5.15.134 6.1.56 6.5.6 6.6-rc3
> should it be reverted from all of the stable releases, or just for
> 5.15.y?

Just 5.15.y. The others don't have an issue with the patch per my tests.

Thanks,

- Joel

2023-10-16 08:07:10

by Greg Kroah-Hartman

Subject: Re: [PATCH 5.15 000/183] 5.15.134-rc1 review

On Sun, Oct 15, 2023 at 10:25:40PM -0400, Joel Fernandes wrote:
> On Wed, Oct 11, 2023 at 1:44 PM Greg Kroah-Hartman
> <[email protected]> wrote:
> >
> > On Wed, Oct 11, 2023 at 11:58:49AM -0400, Joel Fernandes wrote:
> > > Hello Greg,
> > >
> > > On Sat, Oct 7, 2023 at 9:00 PM Greg Kroah-Hartman
> > > <[email protected]> wrote:
> > > >
> > > > This is the start of the stable review cycle for the 5.15.134 release.
> > > > There are 183 patches in this series, all will be posted as a response
> > > > to this one. If anyone has any issues with these being applied, please
> > > > let me know.
> > > >
> > > > Responses should be made by Fri, 06 Oct 2023 17:51:12 +0000.
> > > > Anything received after that time might be too late.
> > > >
> > > > The whole patch series can be found in one patch at:
> > > > https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.134-rc1.gz
> > > > or in the git tree and branch at:
> > > > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
> > > > and the diffstat can be found below.
> > > [...]
> > > > Liam R. Howlett <[email protected]>
> > > > kernel/sched: Modify initial boot task idle setup
> > > >
> > >
> > > Let us drop this patch because it caused new tasks-RCU warnings (both
> > > normal and rude tasks RCU) in my stable test rig. We are discussing
> > > the "right fix" and at that time a backport can be done.
> > >
> > > Hope Liam is also Ok with that. I am happy to do that future backport if needed.
> >
> > This is already in a released kernel, a bunch of them:
> > 5.15.134 6.1.56 6.5.6 6.6-rc3
> > should it be reverted from all of the stable releases, or just for
> > 5.15.y?
>
> Just 5.15.y. The others don't have an issue with the patch per my tests.

Ok, now reverted, thanks.

greg k-h